
Developers Planet


haimen February 2016

Append a list of dataframes into one inside a loop in Python

Suppose I have a dataframe that I want to split for K-fold cross-validation. I know there are packages that do this, but I am trying to write the code myself in order to learn a few things. In the code below I take K as a parameter, split the data into K parts, and save them to df_array. For each iteration I want one part as test data and the remaining parts as training data. I am able to put one part into the validation_data variable as the test set, but the training data ends up as a list of the remaining K-1 dataframes (9 in my case). I want to append those into a single dataframe so that I can apply my model to it. Can anybody help me do this?


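import pandas as pd

def k_fold_cross_validation(data, K):
    # split the rows into K interleaved parts: rows i, i+K, i+2K, ...
    df_array = [data[i::K] for i in range(K)]
    print(df_array)
    for i, val in enumerate(df_array):
        validation_data = pd.DataFrame(df_array[i])
        print("validation")
        print(validation_data)
        # this is still a list of K-1 dataframes, not a single dataframe
        training_data_list = df_array[:i] + df_array[i+1:]
        print("training")
        print(training_data_list)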


My output should be validation part 0, with the training data as a single dataframe built from parts 1, 2, 3, ..., 9. For the next iteration it should be validation part 1, with the training data built from parts 0, 2, 3, ..., 9, and so on.



howMuchCheeseIsTooMuchCheese February 2016

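  training_data_list = df_array[:i] + df_array[i+1:]
  # the loop body is not shown in the post; pd.concat is assumed here as one way to build mDF
  mDF = pd.concat(training_data_list)

mDF will now hold all the data from the list of DataFrames.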
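
For reference, here is a minimal sketch of the whole loop with the remaining parts combined by pd.concat. It assumes Python 3 and a recent pandas; the function name k_fold_splits and the 10-row example frame are hypothetical and only there for illustration. It is one possible way to do it, not necessarily what the asker ended up using.

```python
import pandas as pd


def k_fold_splits(data, k):
    """Yield (fold_index, training_data, validation_data) for each of k folds."""
    # Strided slicing, as in the question: fold i takes rows i, i+k, i+2k, ...
    folds = [data.iloc[i::k] for i in range(k)]
    for i, validation_data in enumerate(folds):
        # Concatenate every fold except the i-th into a single DataFrame.
        training_data = pd.concat(folds[:i] + folds[i + 1:])
        yield i, training_data, validation_data


# Hypothetical 10-row frame, used only to illustrate the split.
example = pd.DataFrame({"x": range(10)})
for i, train, valid in k_fold_splits(example, 5):
    print(i, valid["x"].tolist(), train["x"].tolist())
```

The training rows come out grouped part by part rather than in their original order; calling training_data.sort_index() restores the original order if that matters. The sketch also assumes k is at least 2, since with k = 1 there is nothing left to concatenate and pd.concat raises an error on an empty list.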


