NOTE: Before we get started, you need to generate a Personal Access Token to access the GitHub API. Generate one and add it to line 8 of the code.
NOTE: You aren't allowed to commit your Access token, else it gets revoked. While this is a security feature, it also makes it difficult to debug your file locally and then push a modified version. I came across a solution for this and have written it in a gist. Do check that out!
This is a CLI to interact with the GitHub GraphQL API and the GitHub REST API (not anymore, but worth a mention). The core of the program is built on Python, mainly using the requests and argparse libraries. The requests library is used to GET data from the API server and import it into the Python code, while the argparse library is used to build the Command Line Interface, using argument flags.
Feel free to jump to the Installation if you want to skip over the explanation.
GitHub REST API can be used to create calls to get the data you need to integrate with GitHub.
Trying out a basic command, we can look at the functionality of the API
$ curl https://api.github.com/users/Fifirex
> {
> "login": "Fifirex",
> "id": 75205458,
> "node_id": "MDQ6VXNlcjc1MjA1NDU4",
> "url": "https://api.github.com/users/Fifirex",
> ...
> }
If you add the -i (include response headers) flag, we see that the Content-Type is application/json. This implies that the object returned after a GET command is in JSON format, which makes it easier for us to access information through its keys.
Using the access token, we can access information on all (well, not technically all) the Repos for a specific username by running the following command:
curl -H "Authorization: token MYTOKEN" "https://api.github.com/search/repositories?q=user:MYUSERNAME"
This is all we need to know about the REST API to build our CLI.
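The same search can be sketched in Python. This is only an illustration of the curl command above, not the code in the repository; the function names build_repo_search and search_repos are made up for this example, and the requests library is assumed to be installed.

```python
API_ROOT = "https://api.github.com"


def build_repo_search(username, token):
    """Return the URL, query parameters and headers for the repo search.

    Hypothetical helper: it mirrors the curl command above piece by piece.
    """
    url = f"{API_ROOT}/search/repositories"
    params = {"q": f"user:{username}"}
    headers = {"Authorization": f"token {token}"}
    return url, params, headers


def search_repos(username, token):
    """Send the GET request and return the decoded JSON body."""
    import requests  # imported lazily so the helper above works without it

    url, params, headers = build_repo_search(username, token)
    response = requests.get(url, params=params, headers=headers)
    response.raise_for_status()  # surface HTTP errors such as 401 or 403
    return response.json()
```

Calling search_repos("MYUSERNAME", "MYTOKEN") would return a dictionary, so the values can be read by key just like in the curl output.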
GraphQL is the v4 API used by GitHub. GraphQL uses a single query endpoint rather than the multiple endpoints REST uses. This reduces the number of API calls in the program and thus makes it faster.
I stumbled upon API v4 while looking for ways to list the topics of a Repository (Issue #1).
To list the topics of a particular Repo, only GraphQL can be used, as it is a Preview Feature introduced in API v4. So I had to implement this in my --repo flag, since listing the topics of a Repo was asked for.
I integrated the entire --repo module with this API, so only one call was required.
Over time the entire program has migrated from the REST API to the GraphQL API, increasing the speed and keeping the calls clean.
I have also integrated the uinfo module with GraphQL to reduce the query calls over the followers and following sections, which would cost 3 calls on the REST API. This is the query template for the same.
More information about the migration and the implementations is in the Workings section.
Do try out the GitHub Explorer to visualize the nodes and the edges of the GraphQL API in real time!
To get the information from the API servers, we use post from the requests module to run a targeted query.
The returned object is JSON, so navigation is easier and faster using the keys provided.
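A minimal sketch of such a call is shown here. The function names build_payload and run_query are assumptions made for this example and may not match the names in the program; the endpoint is the public GitHub GraphQL endpoint, and the requests library is assumed to be installed.

```python
GRAPHQL_URL = "https://api.github.com/graphql"


def build_payload(query, variables):
    """Wrap a GraphQL query string and its variables in the JSON body."""
    return {"query": query, "variables": variables}


def run_query(query, variables, token):
    """POST the query to the GraphQL endpoint and return the decoded JSON."""
    import requests  # imported lazily so build_payload works without it

    response = requests.post(
        GRAPHQL_URL,
        json=build_payload(query, variables),
        headers={"Authorization": f"bearer {token}"},
    )
    response.raise_for_status()  # surface HTTP errors such as a bad token
    return response.json()
```

Every flag can then share this one function and differ only in the query string and variables it passes in.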
The interactive interface is then built on these foundations, and was then made into a Unix Executable File so it can be run through the Terminal.
We use argparse to convert the executable file into a useful Command Line Interface. To find out how it works, we run the following:
$ FifiGit --help
usage: FifiGit [-h] [-u | -ub | -r REPO | -la {DESC,ASC} | -ls {DESC,ASC} | -lc {DESC,ASC} | -lu {DESC,ASC} | -lp {DESC,ASC}] name
GitHub API interface
positional arguments:
name Initiates the program for the given user
optional arguments:
-h, --help show this help message and exit
-u, --uinfo Display the User Info
-ub, --uinfo_basic Display the basic User Info only
-r REPO, --repo REPO Displays the Repo Info
-la {DESC,ASC}, --alpha_list {DESC,ASC}
lists all the Repos in Alpha order.
-ls {DESC,ASC}, --star_list {DESC,ASC}
lists all the Repos in Stargazers order.
-lc {DESC,ASC}, --created_list {DESC,ASC}
lists all the Repos in DOC order.
-lu {DESC,ASC}, --updated_list {DESC,ASC}
lists all the Repos in DOU order.
-lp {DESC,ASC}, --pushed_list {DESC,ASC}
lists all the Repos in DOP order.
Enjoy the program! :)
These are the possible flags for the interface:
--help - The help menu above is shown.
--uinfo - It takes the argument name and displays the User Info for the respective User.
--uinfo_basic - It takes the argument name and displays only the basic information for the User (no followers/following).
--repo - It takes 2 arguments, name and REPO, and searches for the repository by that name on the User's account. If found, it displays the basic information about it.
--alpha_list - It takes the argument name and displays the list of all the Repositories sorted alphabetically. [in ASC or DESC order]
--star_list - It takes the argument name and displays the list of all the Repositories sorted by Stargazers. [in ASC or DESC order]
--created_list - It takes the argument name and displays the list of all the Repositories sorted by Date of Creation. [in ASC or DESC order]
--updated_list - It takes the argument name and displays the list of all the Repositories sorted by Date of Update. [in ASC or DESC order]
--pushed_list - It takes the argument name and displays the list of all the Repositories sorted by Date of Last Push. [in ASC or DESC order]
The workings of each flag are explained in the Workings section.
These flags are grouped using add_mutually_exclusive_group(), so only one of them can be active at a time (hence the | in the help menu).
The help menu is filled in with the following statement:
parser = argparse.ArgumentParser(description = "GitHub API interface", epilog = "Enjoy the program! :)")
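To show how the grouping fits together, here is a reduced sketch of such a parser. It is reconstructed from the help output above rather than copied from the program, so the build_parser name, the prog value and the subset of flags are assumptions; the remaining list flags would follow the same pattern as --star_list.

```python
import argparse


def build_parser():
    """Build a cut-down version of the FifiGit parser (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="FifiGit",
        description="GitHub API interface",
        epilog="Enjoy the program! :)",
    )
    parser.add_argument("name", help="Initiates the program for the given user")

    # Only one flag from this group may be given per invocation.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-u", "--uinfo", action="store_true",
                       help="Display the User Info")
    group.add_argument("-ub", "--uinfo_basic", action="store_true",
                       help="Display the basic User Info only")
    group.add_argument("-r", "--repo", metavar="REPO",
                       help="Displays the Repo Info")
    group.add_argument("-ls", "--star_list", choices=["DESC", "ASC"],
                       help="lists all the Repos in Stargazers order.")
    return parser
```

Passing two flags from the group, for example -u together with -ub, makes argparse print an error and exit instead of running the program.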
The reason I chose Python over something like C++ is because of how easily it handles HTTP connections. GET commands can be written in just a single line, and playing with JSON objects is simpler too.
But...
The GET process in general is slow in any language, and thus acts as the rate-limiting step in the program. The larger the data set (google, for example, has over 2000 repos), the more time it takes.
The entire program was first built on the REST API, and it used to interact with the server over multiple redundant GET calls.
I first implemented the GraphQL API in this commit. The time-saving fundamentals used in the previous version were carried forward into the current module.
Going about the different tags and the general infrastructure of the code:
The fundamental idea of the code is to read the argparse tags and make queries accordingly to the GitHub GraphQL API.
The advantage of GraphQL, as discussed above, is that it has a single endpoint, so one complex structured call can theoretically replace thousands of REST API calls and still handle the finer-grained work (sorting, listing, slicing, etc.).
The migration took place steadily over several commits, in search of clean code with the best possible query time.
The query targeted for the --uinfo tag is
query ($user:String!) {
  user(login: $user) {
    name
    avatarUrl
    url
    email
    bio
    followers(first: 100) {
      nodes {
        login
      }
      totalCount
      edges {
        cursor
      }
    }
    following(first: 100) {
      nodes {
        login
      }
      totalCount
      edges {
        cursor
      }
    }
  }
}
This is the initial query, limited to a page size of 100, which fetches the first 100 followers/following and the basic information of the user.
We then iterate the following query, using the cursor element, to generate the complete list:
query ($user:String!, $curso:String!) {
  user(login: $user) {
    followers(first: 100, after: $curso) {
      nodes {
        login
      }
      edges {
        cursor
      }
    }
  }
}
The data is then extracted from the resulting JSON object. A similar query with fewer fields is run for --uinfo_basic.
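The cursor iteration described above could look roughly like this. It is a sketch under assumptions, not the program's actual loop: fetch_all_followers is a hypothetical name, and run_query stands for any callable that takes a query string and a variables dictionary and returns the decoded JSON response.

```python
PAGE_SIZE = 100

FIRST_PAGE = """
query ($user: String!) {
  user(login: $user) {
    followers(first: 100) {
      nodes { login }
      edges { cursor }
    }
  }
}
"""

NEXT_PAGE = """
query ($user: String!, $curso: String!) {
  user(login: $user) {
    followers(first: 100, after: $curso) {
      nodes { login }
      edges { cursor }
    }
  }
}
"""


def fetch_all_followers(run_query, user):
    """Collect every follower login by following the last cursor of each page."""
    logins = []
    cursor = None
    while True:
        if cursor is None:
            result = run_query(FIRST_PAGE, {"user": user})
        else:
            result = run_query(NEXT_PAGE, {"user": user, "curso": cursor})
        page = result["data"]["user"]["followers"]
        logins.extend(node["login"] for node in page["nodes"])
        # A page shorter than PAGE_SIZE means there is nothing left to fetch.
        if len(page["nodes"]) < PAGE_SIZE:
            break
        cursor = page["edges"][-1]["cursor"]
    return logins
```

Stopping on a short page is one possible termination rule; comparing the collected count against totalCount from the first query would work as well.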
The --repo tag is a basic search operation over a given user and Repo name. The following query is run:
query ($user:String!, $repo:String!) {
  user(login: $user) {
    repositories(ownerAffiliations: OWNER) {
      totalCount
    }
  }
  repository(name: $repo, owner: $user) {
    licenseInfo {
      key
      name
      url
      spdxId
    }
    isFork
    isPrivate
    forks {
      totalCount
    }
    stargazers {
      totalCount
    }
    watchers {
      totalCount
    }
    repositoryTopics(first: 100) {
      nodes {
        topic {
          name
        }
      }
    }
  }
}
This throws an Exception if the Repo is not found, and returns the Repo details if it is. These are then displayed to the user.
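One way to turn the response into either an exception or a set of details is sketched here. It assumes that a missing Repo comes back as a null repository field, possibly together with an errors list, which is how GraphQL servers commonly report such cases; the extract_repo name and the keys of the returned dictionary are invented for this example.

```python
def extract_repo(result):
    """Return the Repo details from a decoded response, or raise LookupError."""
    errors = result.get("errors")
    repo = (result.get("data") or {}).get("repository")
    if errors or repo is None:
        if errors:
            message = errors[0].get("message", "unknown error")
        else:
            message = "repository not found"
        raise LookupError(message)
    return {
        "is_fork": repo["isFork"],
        "is_private": repo["isPrivate"],
        "forks": repo["forks"]["totalCount"],
        "stars": repo["stargazers"]["totalCount"],
        "watchers": repo["watchers"]["totalCount"],
        # licenseInfo is null for a Repo without a license.
        "license": (repo["licenseInfo"] or {}).get("name"),
        "topics": [n["topic"]["name"] for n in repo["repositoryTopics"]["nodes"]],
    }
```

The caller can catch LookupError and print a friendly message instead of a traceback.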
The list tags are where the utility of GraphQL really comes into the picture.
There are 2 major differences here:
- The MAX_CAP on the number of Repos is removed (it is 1000 using the REST API).
- There is no need for manual sorting, or for different calls for each sorting parameter.
Keeping that in mind, this is the main query used for listing:
query ($user:String!, $str:String!, $str2:String!) {
user(login: $user) {
repositories(first: 10, ownerAffiliations: OWNER, orderBy: {field: $str, direction: $str2}) {
totalCount
nodes {
name
isFork
}
edges {
cursor
}
}
}
}
where field and direction are the parameter you chose to sort the list by and the order in which you want the results.
I am running the final query page-wise, and hence using the cursor again to load the next page.
To save time over this operation, I append the cursor of the last element of every page to a list, and just use these cursors to make the query that reloads a page:
query ($user:String!, $str:String!, $str2:String!, $curs:String!) {
  user(login: $user) {
    repositories(first: 10, ownerAffiliations: OWNER, orderBy: {field: $str, direction: $str2}, after: $curs) {
      nodes {
        name
        isFork
      }
      edges {
        cursor
      }
    }
  }
}
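The list of last-element cursors can be sketched as a small helper. This is an illustration of the idea only; the CursorHistory class and its method names are hypothetical and are not taken from the program.

```python
class CursorHistory:
    """Remembers the last cursor of every page that has been loaded."""

    def __init__(self):
        # _last_cursors[i] is the cursor of the last element on page i.
        self._last_cursors = []

    def record(self, page_index, edges):
        """Store the last cursor of a page the first time it is seen."""
        if edges and page_index == len(self._last_cursors):
            self._last_cursors.append(edges[-1]["cursor"])

    def after_for(self, page_index):
        """Return the cursor to pass as 'after' to load the given page.

        Page 0 needs no cursor, so None is returned for it.
        """
        if page_index == 0:
            return None
        return self._last_cursors[page_index - 1]
```

With this in place, going back to an earlier page is a single query with a stored cursor rather than a walk from the first page.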
Once we have the results after running the respective queries, they come in similar but distinct formats.
For example, running the above query for one of the list tags, in a specific iteration with the given variables:
{
  "user": "Fifirex",
  "str": "STARGAZERS",
  "str2": "DESC",
  "curs": "Y3Vyc29yOnYyOpIAzhP2Lcc="
}
This is the output:
{
  "data": {
    "user": {
      "repositories": {
        "nodes": [
          {
            "name": "my-first-rep",
            "isFork": false
          },
          {
            "name": "github-slideshow",
            "isFork": false
          }
        ],
        "edges": [
          {
            "cursor": "Y3Vyc29yOnYyOpIAzhP1MMU="
          },
          {
            "cursor": "Y3Vyc29yOnYyOpIAzhMJqqM="
          }
        ]
      }
    }
  }
}
If I store this in a container result, to navigate to a specific value (say, the 2nd Repo name), I need to use:
name = result["data"]["user"]["repositories"]["nodes"][1]["name"]
# name = "github-slideshow"
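The same navigation extends naturally to the whole page. The snippet here is a small sketch that parses the sample output shown above and pulls out every Repo name plus the cursor needed for the next page; the variable names are chosen for this example.

```python
import json

# The sample response from above, as the raw text returned by the server.
raw = """
{"data": {"user": {"repositories": {
  "nodes": [
    {"name": "my-first-rep", "isFork": false},
    {"name": "github-slideshow", "isFork": false}
  ],
  "edges": [
    {"cursor": "Y3Vyc29yOnYyOpIAzhP1MMU="},
    {"cursor": "Y3Vyc29yOnYyOpIAzhMJqqM="}
  ]
}}}}
"""

result = json.loads(raw)
repos = result["data"]["user"]["repositories"]

# Every Repo name on this page, in the order returned by the API.
names = [node["name"] for node in repos["nodes"]]

# The cursor of the last element, used as 'after' for the next page.
last_cursor = repos["edges"][-1]["cursor"]
```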
This is all about how the code works. I might've missed a couple of points about speed computations, Graph Tangling, pagination, etc. but those were the major issues (learnings, perhaps?) that were faced.
Let's get to the installation now!
NOTE : This is a Unix Executable File, which can be run only on a Unix-like OS (Linux or macOS). To use the file on Windows, you can extract the Python core, convert it to a .exe file, and add it to the PATH in a similar fashion.
NOTE : For API access, we need an access token (mentioned in Section 1). Generate one and add it to line 8 in the code.
First, we need the code. Clone the repository into a Directory of your choice using
git clone https://github.com/Fifirex/KOSS-Selections-API.git
Once we have the executable file, just typing ./FifiGit while in the Directory will run the program. To make it a true CLI, we need to meddle with the $PATH.
First, make a temporary bin, which will then be added to the PATH:
$ mkdir -p ~/bin
$ cp FifiGit ~/bin
$ export PATH=$PATH":$HOME/bin"
This creates a copy of the file in a directory on the $PATH, and now you can run it with just
$ FifiGit Fifirex -u
>
> ...
Pretty neat, eh?
To go a step further, so that we can run this command from anywhere in the system and don't have to set the PATH again and again, we can change the ~/.bash_profile to run the above export command every time Terminal is opened.
To do that, we edit the ~/.bash_profile:
$ open -e ~/.bash_profile
This opens the file in TextEdit. Now just add the following line at the end of the file:
export PATH=$PATH":$HOME/bin"
Go to the home directory and source .bash_profile to apply the changes.
$ cd ~
$ source .bash_profile
And voilà!
Now you can run the file from anywhere on the system by just typing FifiGit, with the format mentioned above, in your Terminal.