Python GitHub API
When fetching information from the web, we usually request complete web pages and extract information by parsing the HTML. An Application Programming Interface (API) serves the same purpose more efficiently: instead of a full page, it returns only the requested data in a structured format such as JSON.
This tutorial will teach you how to create a self-contained application that generates a summary based on the information it obtains through the API.
GitHub is a website where programmers can contribute to various open-source projects.
In this article, we will use the GitHub API to request information about Python projects on GitHub, and then summarize the information we obtain.
Prerequisites
To follow along, you need a basic understanding of Python.
Objectives
In this article we will go through:
Using an API call to request data.
Installing the requests library.
Processing an API response.
Using the response dictionary.
Summing up the top repositories.
Requesting data using an API call
GitHub’s web API allows you to make API requests for a range of data.
Type the following URL into your web browser’s address bar and press Enter to see what an API call looks like:
https://api.github.com/search/repositories?q=language:python&sort=stars
Let’s examine the parts of the API call:
https://api.github.com/ - sends the request to the GitHub web server that handles API calls.
search/repositories - is the endpoint that tells the API to search across all GitHub repositories.
? - indicates that an argument is about to be passed.
q= - the character q stands for query.
language:python - restricts the query to repositories that use Python as their main language.
&sort=stars - sorts the projects by the number of stars they have received.
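The parts listed above can also be assembled in code. The following is a minimal sketch using only Python's standard library; the helper name build_search_url and the constants BASE_URL and ENDPOINT are illustrative choices, not part of the GitHub API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://api.github.com/"
ENDPOINT = "search/repositories"


def build_search_url(language="python", sort="stars"):
    """Assemble the search URL from the parts described above."""
    params = {"q": f"language:{language}", "sort": sort}
    # safe=":" keeps the colon in "language:python" instead of encoding it as %3A
    return f"{BASE_URL}{ENDPOINT}?{urlencode(params, safe=':')}"


url = build_search_url()
print(url)

# Split the URL again to see each part on its own
parts = urlsplit(url)
print(parts.netloc)           # the server that handles API calls
print(parts.path)             # the endpoint
print(parse_qs(parts.query))  # the arguments passed after "?"
```

Building the query from a dictionary makes it easy to change the language or the sort order without editing the URL by hand.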
Upon fetching the API data, the response will look like:
NOTE: The output above shows only the first few lines of the response.
Let’s examine the output:
In the second line of the result, you can see that GitHub has detected a total of 7668509 Python projects.
The value for incomplete_results is false, which tells us that GitHub was able to process the query completely.
The key items holds a list of objects that contain information about the Python-based projects on GitHub.
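To make the three top-level keys concrete, here is a small sketch that parses a response body with the standard json module. The body below is a hand-written, heavily trimmed stand-in for the real response (an assumption for illustration); a real item carries many more fields.

```python
import json

# Hypothetical, trimmed response body: only the three top-level keys
# discussed above, with a single shortened entry under "items".
SAMPLE_BODY = """
{
  "total_count": 7668509,
  "incomplete_results": false,
  "items": [
    {"name": "public-apis"}
  ]
}
"""

data = json.loads(SAMPLE_BODY)

print(list(data))                  # the top-level keys
print(data["total_count"])         # number of matching repositories
print(data["incomplete_results"])  # JSON false becomes Python False
print(len(data["items"]))          # number of repositories in this sample
```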
Let’s try to explore more information by parsing the API’s output using Python.
Installing requests
The requests package lets a Python program request data from a website and examine the response easily.
Run the following command to install requests:
pip install requests
Visit this link if this is your first time using pip for installing packages.
Processing an API response
To fetch the most starred Python projects on GitHub, we’ll start writing a program that will make an API call and evaluate the data as shown:
Let’s understand the code snippet above:
We begin by importing the requests module.
Then, we use the requests package to make the API call to the particular url using get().
The API response is saved in a variable called response.
The status_code attribute of the response object indicates whether the request was successful.
A successful API call returns the status_code 200, while an unsuccessful one returns a 4xx code for a problem with the request (such as 403 or 422) or a 5xx code (such as 500) for a server-side error.
Then, we use the json() method to convert the information from JSON format to a Python dictionary.
We store the converted JSON in response_dict.
Then, we print the keys from response_dict, which are total_count, incomplete_results, and items.
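The steps described above might look roughly like the following sketch. It assumes the requests package is installed; the function name fetch_response_dict, the http_get parameter, the Accept header, and the 10-second timeout are illustrative additions rather than requirements of the API.

```python
URL = "https://api.github.com/search/repositories?q=language:python&sort=stars"
# Media type suggested by GitHub's REST API documentation; optional for this call.
HEADERS = {"Accept": "application/vnd.github+json"}


def fetch_response_dict(url=URL, http_get=None):
    """Make the API call and return (status_code, response_dict)."""
    if http_get is None:
        # Imported here so the function can be tried with a stand-in
        # for requests.get when no network is available.
        import requests
        http_get = requests.get

    response = http_get(url, headers=HEADERS, timeout=10)
    print(f"Status code: {response.status_code}")

    # With requests, this raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()

    # Convert the JSON body into a Python dictionary.
    return response.status_code, response.json()
```

Calling fetch_response_dict() with no arguments performs the real request and returns the status code together with response_dict; printing response_dict.keys() then lists the top-level keys.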
Using the response dictionary
Now, let’s make a report that sums up all the information.
Here, we will read the total number of available repositories whose language is Python, and fetch all the keys of the first entry under items, as shown:
Let’s understand the code snippet above:
The value linked with total_count reflects the number of Python projects available on GitHub.
The value of items is a list of dictionaries, each providing information about a single Python repository.
The list of dictionaries is then saved in repos_dicts.
We select the first item from repos_dicts to look more closely at the information given about each repository.
Finally, we print all of the keys of that item.
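A rough sketch of these steps, written as a function that works on any dictionary shaped like the API response. The function name describe_response and the tiny sample dictionary are assumptions for illustration; a real response lists far more keys per repository.

```python
def describe_response(response_dict):
    """Report the total count, the number of returned repos, and the keys of the first repo."""
    total_count = response_dict["total_count"]
    print(f"Total repositories: {total_count}")

    # Each entry in "items" is a dictionary describing one repository.
    repos_dicts = response_dict["items"]
    print(f"Repositories returned: {len(repos_dicts)}")

    # Look more closely at the first repository, if there is one.
    repo_dict = repos_dicts[0] if repos_dicts else {}
    print(f"Keys: {len(repo_dict)}")
    for key in sorted(repo_dict.keys()):
        print(key)

    return total_count, len(repos_dicts), sorted(repo_dict.keys())


# Hypothetical, trimmed stand-in for a real response_dict.
sample_response = {
    "total_count": 7694326,
    "incomplete_results": False,
    "items": [{"name": "public-apis", "description": "collective collection of open APIs"}],
}

summary = describe_response(sample_response)
```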
Output:
The GitHub API returns a range of data for every repository. The output shows:
status_code as 200.
Total number of repositories as 7694326.
Number of repositories returned as 30.
Each repository dictionary, repo_dict, having 74 keys.
You may get a sense of the type of information you can get about a repository by observing these keys.
Let’s have a look at what some of the keys in repo_dict contain:
Output:
Examining the output:
You can observe that the most popular Python repository on GitHub is public-apis.
The owner of the repository is public-apis.
It has been starred more than 140,000 times.
The project was created in March 2016.
The project description of public-apis is "collective collection of open APIs".
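The values discussed above can be pulled out of a single repository dictionary as in the sketch below. The key names (name, owner, login, stargazers_count, html_url, created_at, description) are the ones GitHub uses in repository objects; the function name summarize_repo and the sample values, including the star count and the exact creation timestamp, are placeholders for illustration.

```python
def summarize_repo(repo_dict):
    """Pick a few informative fields out of one repository dictionary."""
    return {
        "name": repo_dict["name"],
        # The owner is itself a dictionary; "login" holds the account name.
        "owner": repo_dict["owner"]["login"],
        "stars": repo_dict["stargazers_count"],
        "url": repo_dict["html_url"],
        "created": repo_dict["created_at"],
        "description": repo_dict["description"],
    }


# Placeholder data shaped like one entry of repos_dicts.
sample_repo = {
    "name": "public-apis",
    "owner": {"login": "public-apis"},
    "stargazers_count": 140000,            # placeholder value
    "html_url": "https://github.com/public-apis/public-apis",
    "created_at": "2016-03-01T00:00:00Z",  # placeholder timestamp
    "description": "collective collection of open APIs",
}

for field, value in summarize_repo(sample_repo).items():
    print(f"{field}: {value}")
```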
Summing up the top repositories
We’ll try to analyze more than one repository.
Let’s create a loop that prints specified information about each of the repositories supplied by the API call:
We print the name of each project, its owner, the number of stars it has, its GitHub URL, and the project’s description inside the loop:
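One possible shape for such a loop is sketched below. The helper names format_repo and print_top_repos, the limit parameter, and the sample entries are hypothetical; in the tutorial's program the list would be repos_dicts taken from the API response. GitHub may return null for a missing description, so the sketch substitutes a short placeholder text in that case.

```python
def format_repo(repo_dict):
    """Return a short multi-line summary of one repository."""
    # "description" can be None when a project has no description.
    description = repo_dict.get("description") or "(no description)"
    return "\n".join([
        f"Name: {repo_dict['name']}",
        f"Owner: {repo_dict['owner']['login']}",
        f"Stars: {repo_dict['stargazers_count']}",
        f"Repository: {repo_dict['html_url']}",
        f"Description: {description}",
    ])


def print_top_repos(repos_dicts, limit=None):
    """Print a summary for each repository, optionally only the first `limit`."""
    for repo_dict in repos_dicts[:limit]:
        print(format_repo(repo_dict))
        print()  # blank line between repositories


# Placeholder data shaped like repos_dicts; the second entry is made up.
sample_repos = [
    {
        "name": "public-apis",
        "owner": {"login": "public-apis"},
        "stargazers_count": 140000,
        "html_url": "https://github.com/public-apis/public-apis",
        "description": "collective collection of open APIs",
    },
    {
        "name": "example-repo",
        "owner": {"login": "example-user"},
        "stargazers_count": 10,
        "html_url": "https://github.com/example-user/example-repo",
        "description": None,
    },
]

print_top_repos(sample_repos)
```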
Conclusion
In this tutorial, we have gone over the following:
Using an API call to request data.
Installing requests.
Processing an API response.
Using the response dictionary.
Summing up the top repositories.
You can check out the full code here.