Suppose I have a dataframe that I want to split for K-fold cross-validation. I know there are packages that do this, but I am trying to write the code myself in order to learn a few things. In the following code, I take K as a parameter, split the data into K parts, and save them in df_array. For each iteration I want one part as the test set and the rest as training data. I am able to assign one part as the test set in the validation_data variable, but the training data ends up as a list of the remaining 9 dataframes. I want to combine them into a single dataframe so that I can apply my model to it.
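import pandas as pd

# Split the rows of `data` into K interleaved folds.
df_array = [data[i::K] for i in range(K)]
for i, val in enumerate(df_array):
    # Fold i is the validation set for this iteration.
    validation_data = pd.DataFrame(df_array[i])
    print("validation ")
    # The remaining K-1 folds, still a list of separate dataframes.
    training_data_list = df_array[:i] + df_array[i+1:]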
My output should be: in the first iteration, fold 0 is the validation set and the training set is a single dataframe made of folds 1, 2, 3, ..., 9. In the next iteration, fold 1 is the validation set and the training set is a dataframe made of folds 0, 2, 3, ..., 9, and so on.
Can anybody help me in doing this?
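To make the desired result concrete, here is a minimal sketch of one possible way to do the combining step, assuming Python 3 and that `pd.concat` (which accepts a list of dataframes and stacks them row-wise) is acceptable here. The sample `data`, the value `K = 10`, and the helper name `k_fold_splits` are made up for illustration and are not part of my actual code.

```python
import pandas as pd

# Hypothetical sample data: 20 rows, so each of the K = 10 folds gets 2 rows.
data = pd.DataFrame({"x": range(20), "y": range(20, 40)})
K = 10


def k_fold_splits(data, K):
    """Yield (fold index, validation frame, training frame) for each fold."""
    # Same interleaved split as in the question: fold i takes rows i, i+K, ...
    df_array = [data[i::K] for i in range(K)]
    for i in range(K):
        validation_data = df_array[i]
        # All folds except fold i, still a list of K-1 dataframes.
        training_data_list = df_array[:i] + df_array[i + 1:]
        # pd.concat stacks the list row-wise into one dataframe.
        training_data = pd.concat(training_data_list)
        yield i, validation_data, training_data


for i, validation_data, training_data in k_fold_splits(data, K):
    print("validation", i, validation_data.shape, "training", training_data.shape)
```

Note that the rows of `training_data` come out grouped by fold rather than in their original order; calling `training_data.sort_index()` would restore the original row order if that matters for the model.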