Whether Parquet files are better than CSV files depends on the specific use case and requirements, and also on who will receive the file! When might .csv files be better than .parquet files? CSV files are a simple and widely used file format for storing tabular data. They are...
Difference between python --version and python -v. These two commands have completely different purposes: python --version Shows the Python interpreter's version number Example output: Python 3.9.7 This is the standard way to check which version of Python you're running It's equivalent to python -V (capital V) python -v Activates verbose...
How can you use Python to fill in the missing dates in a file? Let's see that together!
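As a quick taste, here is a minimal sketch of one way to do it with the standard library only. It assumes a file with one row per day and ISO dates (YYYY-MM-DD) in the first column; the sample content, the `fill_missing_dates` helper and the `visits` column are hypothetical and only there for illustration.

```python
import csv
import io
from datetime import date, timedelta


def fill_missing_dates(rows, fill_value=""):
    """Return one (date, value) pair for every day between the first and last date.

    `rows` is an iterable of (iso_date_string, value) pairs.
    Days absent from the input get `fill_value`.
    """
    by_day = {date.fromisoformat(day): value for day, value in rows}
    if not by_day:
        return []
    current, last = min(by_day), max(by_day)
    filled = []
    while current <= last:
        filled.append((current.isoformat(), by_day.get(current, fill_value)))
        current += timedelta(days=1)
    return filled


# Hypothetical file content, inlined so the example is self-contained.
sample = "date,visits\n2024-01-01,10\n2024-01-04,7\n"
reader = csv.reader(io.StringIO(sample))
header = next(reader)  # skip the header row
filled_rows = fill_missing_dates(reader)

for row in filled_rows:
    print(row)
```

With a real file, you would replace `io.StringIO(sample)` with an `open(...)` call on your own path. If you already work with pandas, a reindex on a date range is another option.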
Not so long ago, I faced an issue while trying to test a colleague's idea: how can we test which links users are more willing to click? TL;DR If my article is too long, these are the steps: Define the scope Extract or get the HTML content Split your group...
To read a .parquet file with R, you can use the arrow package, which provides a way to read and write data in the Arrow format, including Parquet. For you, I created an example below: library(arrow) #read the Parquet file into a data.frame data <- read_parquet("example.parquet") #display the data.frame print(data)...
This is the regex you will want to use to extract the last part of a URL, also known as the handle: [^\/]+$
I often use this to compare URLs that are localized and end with the same handle.
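Here is a minimal sketch of that comparison in Python, assuming the standard `re` module. The `extract_handle` helper and the example.com URLs are hypothetical, made up for the illustration.

```python
import re

# Same pattern as above: every character after the last "/" of the string.
HANDLE_PATTERN = re.compile(r"[^\/]+$")


def extract_handle(url):
    """Return the last path segment of `url`, or None when there is none."""
    match = HANDLE_PATTERN.search(url)
    return match.group(0) if match else None


# Two hypothetical localized URLs sharing the same handle.
en_url = "https://example.com/en/blog/my-article"
fr_url = "https://example.com/fr/blog/my-article"

same_handle = extract_handle(en_url) == extract_handle(fr_url)
print(extract_handle(en_url), same_handle)
```

Keep in mind that a URL ending with a trailing slash has nothing after its last `/`, so the helper returns `None`, and that a query string, if present, is part of the match.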
Sometimes, when you're reading data in a spreadsheet like Microsoft's Excel, it's simpler to put a formula directly into the spreadsheet than to go to R, Python or any other solution.
To save a .parquet file with Python, you can use the pandas library, which provides a convenient way to read and write data in a variety of formats, including Parquet. Let me share with you an example: ```Python import pandas as pd # create a sample DataFrame data = {'name': ['Alice',...
I was recently introduced to .parquet extension files. Let me share with you what I learned! Parquet files are popular for storing and processing large datasets in big data environments. Introduction: a few words about the .parquet Parquet file is a columnar storage file format that is commonly used...
Welcome to AC Consulting, a unique platform where technology meets innovation. AC Consulting is not just a business; it's a passion project, a side hustle, and a tech playground for Arthur Camberlein.
Following my article about parquet files, I thought I could give you some examples of when you might want to prefer .csv files and when you might want to use .parquet. When could we use CSV files? Storing small to medium-sized datasets that can be easily opened and read in...
To read a .parquet file with Python, the pandas library is your friend. In fact, pandas provides a convenient way to read and write data in a variety of formats (you might be familiar with CSV or XLS[X] files), including Parquet.
Another Python and Data post today for you, by me, with the help of what I read on Twitter ... and used. Now it's time to share it with you too!
I started learning SQL in Luxembourg, but you might be surprised by how I started using this language in the first place: in a simplified form, in Google Spreadsheet ;-).
Let me show you how to make your Streamlit App shine on the Internet!
This import will work if you are using any version of Python (meaning Python 2 or Python 3). How to import a library To import a library, you will have to use import + {the name of your library}. So you could do this to import libraries one by one:...
If you enjoy using notebooks for Python (Google Colab, Jupyter, ...), you will be glad to know there is a tip that can save you some time when importing files from your computer (or from the computer of the user of your app/script). 2 lines to create a prompt...
How to know (and display) all columns of a DataFrame in Python Prerequisites We will once again use the (famous) pandas library used in Python. For installation, you have the solutions below: for Python pip install pandas for Python 3 pip3 install pandas You will only need to install your...
import difflib import pandas as pd from datetime import date date = date.today() today = date.strftime("%Y-%m-%d") document = today + "-diff" document_txt = "data/" + document + ".txt" document_csv = "data/" + document + ".csv" with open('robots-live.txt') as robots_live, open('robots-staging2.txt') as robots_staging: diff = difflib.unified_diff( robots_live.readlines(), robots_staging.readlines(), fromfile='robots-live.txt', tofile='robots-staging.txt', )...
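To see what `difflib.unified_diff` returns without needing the two robots.txt files, here is a minimal sketch on in-memory lines. The rules in `live` and `staging` are hypothetical sample content, not the real files from the script above.

```python
import difflib

# Hypothetical content standing in for the two robots.txt files.
live = ["User-agent: *\n", "Disallow: /admin\n"]
staging = ["User-agent: *\n", "Disallow: /\n"]

# unified_diff returns a generator of lines; list() materializes it.
diff_lines = list(
    difflib.unified_diff(
        live,
        staging,
        fromfile="robots-live.txt",
        tofile="robots-staging.txt",
    )
)

# Keep only added/removed lines, skipping the "---" and "+++" file headers.
changed = [
    line
    for line in diff_lines
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

print("".join(diff_lines))
```

Lines starting with `-` exist only in the live file and lines starting with `+` only in the staging one, which is handy for spotting a `Disallow: /` that should never reach production.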
Skipping the last column in R when using `read.csv` should be done in two steps: data <- read.csv("test12.csv") data data[,-ncol(data)] That is when you don't know the number of columns. If you know it explicitly, use the code below instead: df <- read.csv("test12.csv")[,-3] This solution comes from a Stack Overflow post I wrote https://stackoverflow.com/questions/48597844/skipping-last-column-in-r-with-read-csv &...