JJSmith February 2016

Pandas Passing Variable Names into Column Name

I have a dataframe that contains 13 different column names, and I have separated these headings into two lists. I now want to perform different operations on each of these lists.

Is it possible to pass column names into pandas as variables? My code at the moment can loop through the list fine, but I am having trouble passing the column name into the function.

Code

CONT = ['age','fnlwgt','capital-gain','capital-loss']
#loops through columns
for column_name, column in df.transpose().iterrows():
    if column_name in CONT:
        X = column_name
        print(df.X.count())
    else:
        print('')

Answers


aiguofer February 2016

try:

for column_name, column in df.transpose().iterrows(): 
    if column_name in CONT:
        print(df[column_name].count()) 
    else: 
        print('')

edit:

To answer your question more precisely: you can use variables to select columns in two ways. df[list_of_columns] returns a DataFrame containing only the columns in list_of_columns, while df[column_name] returns the Series for column_name.
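A minimal sketch of the difference, using made-up values under the question's column names. It also shows why the original df.X fails: attribute access takes the name after the dot literally rather than reading the variable, so here df.column_name looks for a column that is actually called "column_name".

```python
import pandas as pd

# Toy data; the column names mirror the question, the values are made up.
df = pd.DataFrame({
    "age": [25, 38, 52],
    "fnlwgt": [226802, 89814, 336951],
    "capital-gain": [0, 0, 7688],
})

column_name = "age"

# Bracket indexing evaluates the variable, so this selects the "age" column.
series = df[column_name]
print(type(series).__name__, series.count())

# A list of names returns a DataFrame holding just those columns.
subset = df[["age", "capital-gain"]]
print(type(subset).__name__, list(subset.columns))

# Attribute access uses the literal name after the dot, so df.column_name
# looks for a column called "column_name" and raises AttributeError.
try:
    df.column_name
    attribute_lookup_failed = False
except AttributeError:
    attribute_lookup_failed = True
print(attribute_lookup_failed)
```

Bracket indexing is also the only option for names such as 'capital-gain', since a hyphen is not valid in a Python attribute name.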


jezrael February 2016

I think you can use a subset created from the list CONT:

print(df)
  age fnlwgt  capital-gain
0   a    9th             5
1   b    9th             6
2   c    8th             3

CONT = ['age','fnlwgt']

print(df[CONT])
  age fnlwgt
0   a    9th
1   b    9th
2   c    8th

print(df[CONT].count())
age       3
fnlwgt    3
dtype: int64

print(df[['capital-gain']])
   capital-gain
0             5
1             6
2             3

A dictionary may be more convenient than a list here. You can create one with to_dict:

d = df[CONT].count().to_dict()
print(d)
{'age': 3, 'fnlwgt': 3}
print(d['age'])
3
print(d['fnlwgt'])
3


Alexander February 2016

The following will print the count of each column in the dataframe that also appears in your CONT list.

import numpy as np
import pandas as pd
CONT = ['age', 'fnlwgt', 'capital-gain', 'capital-loss']
df = pd.DataFrame(np.random.rand(5, 2), columns=CONT[:2])

>>> df
        age    fnlwgt
0  0.079796  0.736956
1  0.120187  0.778335
2  0.698782  0.691850
3  0.421074  0.369500
4  0.125983  0.454247

Select the subset of columns that are present in the dataframe and count the non-null values in each.

>>> df[[c for c in CONT if c in df]].count()
age       5
fnlwgt    5
dtype: int64
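The same filtering idea extends to the original goal of running a different operation on each of the two lists. The sketch below is only illustrative: the CAT list, the "workclass" column, the data values, and the choice of mean() and nunique() are all assumptions, since the question names only the CONT list.

```python
import pandas as pd

# Toy data; "workclass" and the CAT list are hypothetical stand-ins for the
# asker's second list of column names.
df = pd.DataFrame({
    "age": [25, 38, 52, 38],
    "capital-gain": [0, 0, 7688, 0],
    "workclass": ["Private", "Private", "State-gov", None],
})

CONT = ["age", "fnlwgt", "capital-gain", "capital-loss"]
CAT = ["workclass"]

# Keep only the names that are actually present, preserving list order.
cont_cols = [c for c in CONT if c in df.columns]
cat_cols = [c for c in CAT if c in df.columns]

# One operation for the first list, another for the second.
cont_means = df[cont_cols].mean()    # mean of each numeric column
cat_unique = df[cat_cols].nunique()  # distinct non-null values per column

print(cont_means)
print(cat_unique)
```

Filtering the lists first avoids a KeyError for names such as 'fnlwgt' that are listed but missing from this particular frame.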
