Compare 2 columns in Python in bulk
TIL moment: Today I Learned how to create a new column that compares two other ones in Python, and it is as simple as a one-liner thanks to pandas.
To be honest with you folks, I was going to use numpy and do some "complicated stuff" ... but it is already in pandas & I am really glad I found this. I didn't search for something on the Internet this time, I tried it as I thought it could work ... and it did!
The Python code to compare two columns and create a third one
The Python code would be:
df['match'] = df['Domain_destination'] == df['Domain_source']
What the output looks like
It will give you True or False for each row:
Domain_destination | Domain_source | match |
---|---|---|
camberlein.com | camberlein.com | True |
camberlein.com | camberlein.fr | False |
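To try this end to end, here is a minimal sketch that rebuilds the two rows above. The sample DataFrame is typed in by hand purely for illustration; in practice the data would come from your own file or export.

```python
import pandas as pd

# Sample data built by hand for illustration (an assumption, not real crawl data).
df = pd.DataFrame({
    "Domain_destination": ["camberlein.com", "camberlein.com"],
    "Domain_source": ["camberlein.com", "camberlein.fr"],
})

# Element-wise comparison: one boolean per row, stored in a new column.
df["match"] = df["Domain_destination"] == df["Domain_source"]

print(df)
```

Note that the comparison is strict: a difference in case or a stray space counts as a mismatch.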
This is a perfect example that could be used for data analysis in general, including SEO analysis.
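As a rough sketch of where this could go next in an analysis, the boolean column can be counted or used as a filter. The counting and filtering steps, and the variable names `match_count` and `mismatches`, are my own illustration and not part of the original tip.

```python
import pandas as pd

# Same hand-made sample data as in the table above (illustrative only).
df = pd.DataFrame({
    "Domain_destination": ["camberlein.com", "camberlein.com"],
    "Domain_source": ["camberlein.com", "camberlein.fr"],
})
df["match"] = df["Domain_destination"] == df["Domain_source"]

# True counts as 1 when summed, so this is the number of matching rows.
match_count = int(df["match"].sum())

# "~" inverts the boolean column, keeping only the rows that do not match.
mismatches = df[~df["match"]]

print(match_count)
print(mismatches)
```

Filtering on the mismatches is typically the interesting part, for example to list links whose source and destination domains differ.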
How to do this in a spreadsheet
Of course, you can still use a formula to do this check, using =EXACT()
... but let's be honest, if you are already manipulating your data with Python, it is better to use this one-liner ... don't you agree?
The output in Google Sheets would be:
First_url | Second_url | Exact |
---|---|---|
url_a | url_b | =EXACT(A2,B2) |
url_a | url_b | FALSE |
This tip comes from a tweet ...
... I posted some time ago: https://twitter.com/ArthurCa/status/1534542099890286594