Generating the Cartesian Product of Two DataFrames using Python Pandas



Avoid .size here: it returns the total number of elements, that is, rows multiplied by columns, not the number of rows. If df1 has 5 rows and df2 has 3 rows, a successful Cartesian product has 15 rows. To check this, replace .size with .shape (rows and columns) or .shape[0] (rows only).
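To make the difference concrete, here is a minimal sketch with two made-up frames of 5 and 3 rows. The names df1, df2, product, and the column names are placeholders for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical stand-in frames: 5 rows x 2 columns and 3 rows x 2 columns.
df1 = pd.DataFrame({"a": range(5), "b": range(5)})
df2 = pd.DataFrame({"c": range(3), "d": range(3)})

# Cartesian product via a constant helper key, dropped after the merge.
product = pd.merge(df1.assign(key=0), df2.assign(key=0), on="key").drop("key", axis=1)

print(product.shape)     # (15, 4): rows and columns
print(product.shape[0])  # 15: rows only
print(product.size)      # 60: rows * columns
```

The row count is 5 * 3 = 15, while .size reports 15 * 4 = 60 because the merged frame has four columns.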

In your case:

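import pandas as pd

# Row counts of the two inputs.
print("dna", df_genes.shape[0])
print("names", df_citations.shape[0])
# Give both frames the same constant key so every row pairs with every row.
df_genes['key'] = 0
df_citations['key'] = 0
# Merge on the helper key, then drop it from the result.
df = pd.merge(df_genes, df_citations, on='key').drop('key', axis=1)
# Should equal the product of the two row counts above.
print("df before", df.shape[0])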
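As a possible alternative, pandas 1.2 and newer also accept how="cross" in merge, which avoids the helper column altogether. The sketch below assumes that version and uses invented sample data; the column names dna and names and the variables by_key and by_cross are placeholders, not part of the original question.

```python
import pandas as pd

# Hypothetical sample data; the column names are placeholders.
df_genes = pd.DataFrame({"dna": ["ACGT", "TTGA", "CCAT", "GGTA", "ATAT"]})
df_citations = pd.DataFrame({"names": ["Smith", "Jones", "Lee"]})

# Constant-key approach, using assign() so the input frames are not modified.
by_key = pd.merge(
    df_genes.assign(key=0), df_citations.assign(key=0), on="key"
).drop("key", axis=1)

# Built-in cross join (assumes pandas 1.2 or newer).
by_cross = df_genes.merge(df_citations, how="cross")

# Both results should have 5 * 3 rows.
print("by_key rows", by_key.shape[0])
print("by_cross rows", by_cross.shape[0])
```

Using assign(key=0) instead of df['key'] = 0 leaves df_genes and df_citations untouched, which matters if they are reused later.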
