Generating a histogram of all commits over time
What would I use to generate a histogram of all the commits over time?
To generate a histogram of commits over time, you can use Git's built-in git log command in combination with tools like sort, uniq, and awk. Here's an example command that you can run in your Git repository to count commits per month:
git log --format="%ad" --date=format:"%Y-%m" | sort | uniq -c | awk '{print $2, $1}'
This command prints each commit's author date as year-month, sorts the dates, counts the number of commits for each month, and swaps the columns so the output is a two-column table of year-month and commit count.
Here's an example of what the output might look like:
2022-01 6
2022-02 3
2022-03 1
2022-04 2
You can use this output to create a histogram visualization of your commits over time, for example using a tool like gnuplot or matplotlib.
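If you would rather do the counting in Python, the same per-month tally can be sketched as below. This is only a minimal sketch: the function name commits_per_month is hypothetical, and it assumes the input is a list of dates in the YYYY-MM-DD form that git log --date=short prints.

```python
from collections import Counter


def commits_per_month(dates):
    """Count commits per month from 'YYYY-MM-DD' date strings.

    Hypothetical helper: plays the role of the sort | uniq -c step.
    """
    # The first 7 characters of 'YYYY-MM-DD' are the 'YYYY-MM' month key.
    counts = Counter(date[:7] for date in dates)
    # Return (month, count) pairs in chronological order.
    return sorted(counts.items())


if __name__ == "__main__":
    sample = ["2022-01-03", "2022-01-17", "2022-02-01"]  # made-up dates
    for month, count in commits_per_month(sample):
        print(month, count)
```

The dates could come from reading the output of git log --format="%ad" --date=short line by line.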
How to use this with gnuplot or matplotlib?
To use the data generated by git log with gnuplot or matplotlib, you can save the output of the git log command to a file and then load that file into gnuplot or matplotlib.
For example, to generate a histogram of the number of commits per day using gnuplot, you can use the following steps:
- Run
git log --pretty=format:"%ad" --date=short | sort | uniq -c > commits_per_day.txt
to generate a file commits_per_day.txt containing the number of commits per day.
- Create a gnuplot script file, for example plot_commits_per_day.gp, with the following contents:
set xdata time
set timefmt "%Y-%m-%d"
set format x "%Y-%m-%d"
set xlabel "Date"
set ylabel "Number of Commits"
set title "Commits per Day"
plot "commits_per_day.txt" using 2:1 with boxes
This script sets the x-axis to use time, specifies the format of the dates, and creates a histogram of the data in commits_per_day.txt.
- Run
gnuplot -p plot_commits_per_day.gp
to generate the histogram. The -p option keeps the plot window open after the script finishes.
To use matplotlib instead, you can load the data from the commits_per_day.txt file using Python's built-in file handling and then plot the data using matplotlib. Here's an example Python script:
import datetime

import matplotlib.pyplot as plt

dates = []
counts = []

# Each line of the file looks like "COUNT YYYY-MM-DD" (the output of uniq -c).
with open("commits_per_day.txt") as f:
    for line in f:
        count, date = line.split()
        dates.append(datetime.datetime.strptime(date, "%Y-%m-%d").date())
        counts.append(int(count))

# One bar per day that has commits.
plt.bar(dates, counts)
plt.xlabel("Date")
plt.ylabel("Number of Commits")
plt.title("Commits per Day")
plt.show()
This script reads in the data from commits_per_day.txt, converts the dates to datetime.date objects, and creates a bar chart of the data using matplotlib's bar function.
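The parsing step can also be pulled out into a small function so it can be checked without opening a plot window. This is a minimal sketch: parse_count_lines is a hypothetical name, and it assumes every non-blank line has the "COUNT YYYY-MM-DD" shape that uniq -c produces.

```python
import datetime


def parse_count_lines(lines):
    """Parse 'COUNT YYYY-MM-DD' lines into parallel lists of dates and counts.

    Hypothetical helper; blank lines are skipped.
    """
    dates, counts = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        count, day = parts
        dates.append(datetime.date.fromisoformat(day))
        counts.append(int(count))
    return dates, counts
```

An open file object can be passed directly as lines, and the two returned lists can be handed to plt.bar.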
change this to commits per week
To change the histogram from commits per day to commits per week, you can have git log print each commit's date as a year and week number and count those directly. Note that summing every 7 lines of commits_per_day.txt would not give calendar weeks, because days without commits do not appear in that file. Here's an updated script:
#!/bin/bash
# count commits per ISO week; each output line is "COUNT YYYY-Www"
git log --pretty=format:"%ad" --date=format:"%G-W%V" | sort | uniq -c > commits_per_week.txt
# plot the counts (column 1) as text in the terminal using gnuplot
gnuplot -e "set term dumb; set title 'Commits per week'; plot 'commits_per_week.txt' using 1 with boxes notitle"
In this updated script, git log prints each commit's author date as an ISO year and week number (for example 2022-W05). The %G and %V specifiers are handed to the system's strftime, so this assumes a platform that supports them. The output is sorted and piped to uniq -c to count the number of commits per week, and the result is stored in commits_per_week.txt.
Finally, gnuplot is used to plot the histogram of commits per week as text in the terminal, with one box for each week that has commits.
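If the per-day data is already loaded in Python, the weekly totals can also be computed there by calendar week, so that days without commits do not shift the week boundaries. This is a minimal sketch: commits_per_iso_week is a hypothetical name, the input is assumed to be (date, count) pairs like those read from commits_per_day.txt, and treating a week as an ISO week (Monday to Sunday) is an assumption.

```python
import datetime
from collections import defaultdict


def commits_per_iso_week(daily_counts):
    """Sum (date, count) pairs into ISO calendar weeks.

    Returns a sorted list of ((iso_year, iso_week), total) pairs.
    """
    totals = defaultdict(int)
    for day, count in daily_counts:
        # isocalendar() gives (ISO year, ISO week number, ISO weekday).
        iso_year, iso_week, _weekday = day.isocalendar()
        totals[(iso_year, iso_week)] += count
    return sorted(totals.items())


if __name__ == "__main__":
    sample = [  # made-up data
        (datetime.date(2022, 1, 3), 1),
        (datetime.date(2022, 1, 9), 4),
        (datetime.date(2022, 1, 10), 5),
    ]
    for (year, week), total in commits_per_iso_week(sample):
        print(f"{year}-W{week:02d} {total}")
```

Keying on the ISO year rather than the calendar year keeps a week that straddles New Year in one bucket.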