What is GraphQL?

What is GraphQL?

Let's learn about graphql basics

·

21 min read

GraphQl, the new buzzword in our ecosystem. But what is it about? Why there this hype to learn and use it? Let us travel through the history of APIs together to see, how GraphQl comes to the place it has today

For us to really understand what GraphQl is, we have to understand what APIs are, a little bit of history, a basic knowledge of Restful APIs and it's drawbacks.

Warning

If you are not interested in history, and you already know enough about the Restful APIs, then you might want to jump straight to the Learn GraphQL section.

What is an API?

API stands for Application Programming Interface. It refers to a set of tools, protocols or routines used for building applications. An API informs software on how it must interact.

Another way to understand API is to think of it as the waiter in a restaurantThere's the user (customer), and there's also the software (cook). In a restaurant, the diner's order needs to get to the cook and the result (their actual food) needs to be returned. The one who is responsible for this is the waitress (API).

In other words, APIs are all about connection and are a way to integrate data from one source into another source, such as a web service or mobile app.

What is a Restful API?

The concept of an API pre-dates even the advent of personal computing, let alone the Web, by a very long time! The principal of a well-documented set of publicly addressable "entry points" that allow an application to interact with another system has been an essential part of software development since the earliest days of utility data processing. However, the advent of distributed systems, and then the web itself, has seen the importance and utility of these same basic concepts increase dramatically. Martin Bartlett

In early 2000, Roy Fielding and his colleagues had one objective: Create a standard, so any server could talk to any other server in the world.

The abbreviation REST stands for " Representational State Transfer" and refers to a software architectural style. It is based on four basic (sometimes six) principles that describe how networked resources are defined and addressed on the web, for example in a cloud.

In a Restful API, you have URLs (resources, endpoints) to tell the server what you want, and HTTP Verbs to tell the server How you wantthe data.

The combination of these two will look like this:

<Verb> <Url>

HTTP Verbs

Verbs are predefined HTTP methods that has been standardized like this:

Method Operation Example
GET Get the information GET /api/customers GET /api/customers/:id
POST Create something POST /api/customers/:id/orders POST /api/orders
PUT Update or replace something PUT /api/customers/:id
DELETE Delete something DELETE /api/customers/:id

most common HTTP verbs

Downsides of Restful APIs

Although Restful APIs are widely used today by different consumers, there are some drawbacks when choosing Restful APIs, especially in the context of the HTTP protocol. So, let me mention some of these problems first.

#1 Not everyone can implement it correctly

Ever notice how few people call their API "RESTpure"? Instead, they call it "RESTful" or sometimes "RESTish" That's because not everyone can agree on what all the methods, payloads, and response codes really are, or in most cases, they know that their implementation is far from the standards of a real Restful API.

Consider, As an example, where we might use the 200 OK response code. Should we use it to indicate the successful update of a record, or should we use 201 Created? It seems we really should use a code like 250 Updated which actually doesn't exist.

#2 Exposing many URLs

This might not actually seem like a problem, but imagine you have an evolving application, which needs to expose some URLs to process different requests. This means that you have to expose different URLs for a different type of requests.

That being said, if you want to get a list of all blog posts, you have to send a request to, for example, GET /blog and if you want to get one blog post, then you have to call GET /blog/:id where :id is an identifier for a blog post.

Now comes the best part, if you want to get a list of all comments associated with a specific blog post, then you should call GET /blog/:id/comments. That's basically two API calls or two requests (each has its own size) to your server. So from now on, for each user you get for a blog post, you have to consider two API calls to your server, thus doubling your resource usage!

#3 Over/Under fetching

Before jumping into this, let's clarify two terms:

  • Over-fetching is when the client gets more information than what it actually needs.
  • Under-fetching is when the endpoint doesn't provide all of the required information so the client has to make multiple requests to get everything it needs.

Another common problem developers face when implementing REST is over and under fetching. This is basically because that client can only request data by hitting endpoints that return fixed data structures. Thus they aren't able to get exactly what they need and run into either an over-fetch or an under-fetch scenario.

As an example, let's say you're working on an app that needs to display a list of users with their names and birthdays. The app would hit the /users endpoint and, in turn, receive a JSON response with the user data. However, the data structure would contain more information about the users than what's required by the app. Information like the user's name, birthday, email_address and phone_number. Remember this example, I'm going to use it later.

{
  "username": "aientech",        // you need this
  "avatar": "https://...",        // you need this
  "name": "Aien Saidi",            // you don't need this
  "email" "info@aien.me"        // you don't need this
}

This is basically over-fetching. It might seem like nothing at first glance, but don't imagine them with only one user, but 1000! in that scale, there will be a big difference between 2KB of data and 5KB!

What is GraphQL?

Let's see, GraphQL is Graph Query Language. A query language is a computer language used to retrieve data from a source (mostly a database), so basically, GraphQL is kinda similar to a SQL command:

SELECT firstName, lastName FROM People;

But GraphQL (or GQL for short) is meant for APIs, so you shouldn't actually write any SQLs to get the information you want. (especially you don't want any SQL Injections in your DB)

Now here's the difference, in Rest API you call a resource for data, and if there was a need then multiple resources for different data, and then You are responsible to extract the information you need and show it to the client.

But in GraphQL, You ask for what exactly you want and the GraphQL server is responsible to deliver the data you need.

Now let's get back to our example, where we wanted to show a list of users. Here's how it's possible in both GQL and Rest:

Rest API

Where do we send our request?

GET /api/users

What do we send to the server (request):

{}

What do we get from the server (response):

[
  {
    "id": 1,
    "name": "Leanne Graham",
    "username": "Bret",
    "email": "Sincere@april.biz",
    "address": {
        "street": "Kulas Light",
        "suite": "Apt. 556",
        "city": "Gwenborough",
        "zipcode": "92998-3874",
        "geo": {
            "lat": "-37.3159",
            "lng": "81.1496"
        }
    },
    "phone": "1-770-736-8031 x56442",
    "website": "hildegard.org",
    "company": {
        "name": "Romaguera-Crona",
        "catchPhrase": "Multi-layered client-server neural-net",
        "bs": "harness real-time e-markets"
    }
  },
  ...
]

GraphQL

Where do we send our request?

POST /graphql

What do we send to the server (request):

query {
  users {
      name
    username
    website
  }
}

What do we get from the server (response):

[
  {
    "name": "Leanne Graham",
    "username": "Bret",
    "website": "hildegard.org",
  },
  ...
]

Now let's see another example, we want (1) a blog post, (2) its author's information and (3) a list of all online users in our website.

In a Restful API, we have to send 3 requests to our server as follows:

  • GET /api/blogs/:id
  • GET /api/blogs/:id/authors
  • GET /users?online=true

Since these are all GET requests we might not need to (ideally) provide anything in our request's body. Also, note that each of these requests might over or under fetch the data we need. Now let's see the same in a GraphQL API:

  • POST /graphql

Only one request to a single endpoint, but we have to tell the server what we exactly want:

query {
  blogs(id: "our-awesome-blog-post") {
    title
    intro

    authors {
      name
      avatar
    }
  }

  users(online: true) {
    username
    avatar
  }
}

AwesomeLet's dive deep into the GraphQL world!

Differences between GraphQl and Restful APIs

Here's the big picture:

Rest API GraphQL
Architecture server-driven client-driven
Operations CRUD Queries, Mutations, Subscriptions
Data Fetch many endpoints single endpoint
Learning curve shallow steep
Validation Manual Automatic

Learn GraphQL Basics

Before starting to learn the GraphQL itself, let us get familiar with some terms that are widely used in GQL ecosystem.

Information

To write this part of the post, I'm following the official GraphQL and OpenCrud documentations. Also, note that the GraphQL specifications (or standards if you're ok with it) might change later. For now, I'll use the 2020 working draft version specs.

Queries and Mutations

Do you remember our HTTP Verbs? One of the main differences between Rest and GQL is the usage of HTTP Verbs. In GraphQL, We mostly (or only) use the POST method to call our endpoint (mostly just /graphql). But how can we tell our server if want to get, create, update or delete something? That's where the Queries and Mutation appear.

You can imagine queries in GQL as GET methods in Rest APIs and mutations as POST, PUT and DELETE.

Action Rest Equivalent GQL Equivalent
Create POST Mutations
Read GET Queries
Update PUT Mutations
Delete DELETE Mutations

CRUD actions and their Rest and GQL equivalents

Fields

Fields are in the core of GraphQL, you can imagine them as functions which return values. We basically ask the server to give us the values of fields (using something called a resolver function). In the anatomy of JSON objects, we have name and values, which is field and values in GraphQL:

query {
  blogs {
      title
  }
}

The response:

{
  "data": [
    {
      "title": "Awesome Blog Post"
    },
    ...
  ]
}

In this example, the title is a field in blogs object. A field can also be an object at the same time, which then will include other fields with their values:

query {
  blogs {
    title
    authors {
      name
    }
  }
}

Arguments

Fields are conceptually functions which return values, and occasionally accept arguments which alter their behaviour. These arguments often map directly to function arguments within a GraphQL server's implementation.

query {
  user(id: 4) {
    id
    name
    profilePic(size: 100)
  }
}

Aliases

By default, the key in the response object will use the field name queried. However, you can define a different name by specifying an alias.

query {
  empireHero: hero(episode: EMPIRE) {
    name
  }
  jediHero: hero(episode: JEDI) {
    name
  }
}

and the response:

{
  "data": {
    "empireHero": {
      "name": "Luke Skywalker"
    },
    "jediHero": {
      "name": "R2-D2"
    }
  }
}

Mutations

In REST, any request might end up causing some side-effects (changes) on the server, but by convention, it's suggested that one doesn't use GET requests to modify data. GraphQL is similar, technically any query could be implemented to cause a data write. However, it's useful to establish a convention that any operations that cause write should be sent explicitly via a mutation.

mutation {
  createPost(title: "New Post") {
    title
    createdAt
  }
}

Types

The GraphQL Type system describes the capabilities of a GraphQL server and is used to determine if a query is valid. The type system also describes the input types of query variables to determine if values provided at runtime are valid.

GraphQL is a statically typed query language, meaning that each field has a particular type same as other programming languages. The basic types that are supported by GraphQL are String, Int, Float, Boolean, and ID. These are called Scalar types.

By default, every type is nullable, it's legitimate to return null as any of the scalar types. Use an exclamation point to indicate a type cannot be nullable, so String! is a non-nullable string.

To use a list type, surround the type in square brackets, so [Int] is a list of integers.

Data types in GraphQL can be categorized as:

Type Description
Scalar A basic datatype available in GraphQL. String, Int, Float, Boolean and ID are some examples.
Object GraphQL Objects represent a list of named fields, each of which yield a value of a specific type.
Interface GraphQL interfaces represent a list of named fields and their arguments. GraphQL objects can then implement these interfaces which requires that the object type will define all fields defined by those interfaces. These are in a sense, same as interfaces in object oriented languages.
Union GraphQL Unions represent an object that could be one of a list of GraphQL Object types, but provides for no guaranteed fields between those types. They also differ from interfaces in that Object types declare what interfaces they implement, but are not aware of what unions contain them.
Enum GraphQL Enum types, like scalar types, also represent leaf values in a GraphQL type system. However Enum types describe the set of possible values.
InputObject Fields may accept arguments to configure their behavior. These inputs are often scalars or enums, but they sometimes need to represent more complex values. A GraphQL Input Object defines a set of input fields, the input fields are either scalars, enums, or other input objects. This allows arguments to accept arbitrarily complex structs.
Lists A GraphQL list is a special collection type which declares the type of each item in the List
Non-Null By default, all types in GraphQL are nullable; the null value is a valid response for all of the above types. To declare a type that disallows null, the GraphQL Non?Null type can be used. A non-null field can be represented by using a ! sign appended to the type's name. (e.g String!)
Directives A GraphQL schema describes directives which are used to annotate various parts of a GraphQL document as an indicator that they should be evaluated differently by a validator, executor, or client tool such as a code generator. To define a directive, one can simply use a @ sign at the end of field's definition. (e.g @skip)

A GraphQL server architecture

Now that we've learned a bit of GQL's terminology, we need to know how does a GQL server look like? Does it matter if we create our GQL server using NodeJS? or should it be written in Golang?

No! it really doesn't matter in which language your GQL server is written, as long as it responds to your business needs. But one thing you must consider when writing your own GQL server is the correctness of your implementation, or in other words, your design architecture.

A basic architecture would look like this: General GraphQL architecture

But how do we gain this flexibility with GraphQL? How is it that it's such a great fit for almost every different kinds of use cases?

As we already learned, the payload of a GraphQL query (or mutation) consists of a set of fields. In the GraphQL server implementation, each of these fields actually corresponds to exactly one function that's called a resolver. The sole purpose of a resolver function or method is to fetch the data for its field.

So imagine the following GQL query:

query {
 users {
  name
 }
}

the field name will probably have the following resolver:

import orm from "my-very-awesome-orm-example";

function getName() {
 return orm.select("name").from("users");
}

When the GQL server receives a query, it will call all the functions for the fields that are specified in the query's payload. Thus resolves the query and is able to retrieve the correct data for each field. Once all resolvers returned, the server will package data up in the format that was described by the query and send it back to the client.

Learn by code!

Ok, I think we've had enough theory for our start. Let us build an example graphql server with our beloved language, JavaScript and the ExpressJS framework.

Folder structure and initialization

We will start our work by creating the main folder for our project, then initializing an npm package inside it. Follow me in this one and enter these codes into your terminal (or do them by hand, however you prefer)

By the way, in case you are not a terminal person, the $ sign means that you DON'T need administration rights to do the command. (Administration rights are mostly shown with # sign). I also assume that you have visual studio code installed on your system.

$ mkdir my-awesome-graphql-server
$ cd my-awesome-graphql-server
$ npm init
# fill the questions and then:
$ npm i -S graphql express express-graphql
$ code . && exit

First server

Now that the Project setup is ready, let's create the first server implementation by inserting the following JS code in index.js: I'll try to explain the setup right after you copied the code

var express = require('express');
var gql = require('express-graphql');
var { buildSchema } = require('graphql');

// our sample QraphQl schema
var schema = buildSchema(`
    type Query {
        message: String
    }
`);

// our root resolver
var root = {
    message: () => 'Hello World!'
};

// define the express server and the GQL endpoint
var app = express();

// graphql middleware
app.use('/graphql', gql({
    schema: schema,
    rootValue: root,
    graphiql: true
}));

app.listen(3000, () => console.log('server running on http://localhost:3000'));

Now let's discuss about what is so far happening,

  • Lines 1-3, we have basically imported the libraries we need to set up our GQL server.
  • Lines 6-10, this is one the most important parts of defining a gal schema! Before creating a server, we have to first define how our graphql would look like, which types are used and so on. This is basically possible by defining a complete schema of our gql.
  • Lines 13-15, we define a root resolver, which will then contain all of our resolvers. in this case, message will map to a function which returns a simple Hello World! message.
  • Line 18, just defined our application a basic expressjs app.
  • Lines 21-25, this is the middleware which the express-graphql gives us to use with the express app. We gave it our schema and root resolver. There's also a graphiql option. GraphiQL is basically a simple GraphQL IDE instance for your server.
  • Line 27, we start our app on port 3000.

Now start your server by using node index.js in the root path of your project and then just open http://localhost:3000/graphql, you'll see the GraphiQL IDE ready for you.

GraphiQL IDE provides you with a auto-complete feature. So whenever you update your schema, the GraphiQl will give you the edited version of it. This is possible by introspection system made in graphql server. The auto-complete feature will be activated by pressing Ctrl + Space keys.

Now let's query the message:

As you can see, the message query maps to the message function we wrote before in our root resolvers.

A more sophisticated example

Let us redefine our schema to be like this:

var schema = buildSchema(`
    type Query {
        user(id: Int!): User
        users(company: String): [User]
    },
    type User {
        id: Int
        name: String
        username: String
        email: String
        address: Address
        phone: String
        website: String
        company: Company
    },
    type Address {
        street: String
        suite: String
        city: String
        zipcode: String
        geo: Geo
    },
    type Geo {
        lat: String
        lng: String
    },
    type Company {
        name: String
        catchPhrase: String
        bs: String
    }
`);

Then we need some test data. You can easily open https://jsonplaceholder.typicode.com/users to get a list of users created for testing purposes. Then:

var testData = [
    // paste the content of https://jsonplaceholder.typicode.com/users here
]

var schema = ... // and so on

After putting our test data there, we have to change our root resolvers like this:

var root = {
    user: (args) => {
        var id = args.id;
        return testData.filter(user => {
            return user.id == id;
        })[0];
    },
    users: (args) => {
        if (args.company) {
            var company = args.company;
            return testData.filter(user => user.company.name === company);
        } else {
            return testData;
        }
    }
}

now restart your server and open the GraphiQl IDE

This a more complex example. As you can see, in a graphql query, you can easily define what you want and how you want it.

Now, let us edit the users and do some real crud operations on them. We will therefore add three other resolvers to the root resolver as follows:

var schema = buildSchema(`
    schema {
        query: Query
        mutation: Mutation
    },
    type Mutation {
        createUser(input: UserInputType!): User
        updateUser(id: Int!, input: UserInputType!): User
        deleteUser(id: Int!): Boolean
    },
    input UserInputType {
        name: String
        username: String
    },
    type Query {
...

Here, for the sake of our tutorial, I only defined two fields of the User, name and username. But in a real-world situation, you would want to define the other remaining fields too. And also, pay attention to the company and address.

Watch out!

Note that there's a new schema definition on top of the query. The reason for defining it is that we need to tell our gql server we have both queries and mutations.

Now let's continue to finish it:

var root = {
    user: (args) => {
        var id = args.id;
        return testData.filter(user => {
            return user.id == id;
        })[0];
    },
    users: (args) => {
        if (args.company) {
            var company = args.company;
            return testData.filter(user => user.company.name === company);
        } else {
            return testData;
        }
    },
    createUser: (args) => {
        testData.concat({
            name: args.input.name,
            username: args.input.username
        });

        return args.input;
    },
    updateUser: (args) => {},  // the update script
    deleteUser: (args) => {}   // the delete script
};

As you can see, we added three new resolvers createUser, updateUser and deleteUser. I also didn't want to make the learn steps more complex by creating a real-world graphql server. But I'll tell you, in the end, a possible way of architecting your graphql server.

As you have seen, GraphQL is a new technology that is really powerful. It gives you real power to build better and well-designed APIs. That's why I recommend you start to learn it more deeply now. For me, it eventually replaced REST.

How do I architect my GraphQl server?

I come from a world of MVC architecture, that's why I find the idea of controllers a good thing in the whole design system. What I personally would do when designing an architecture for my gql server, is that I first try to separate components by their functionality. In general:

Now let me explain from bottom to top. The database is kinda clear, it really doesn't matter, for the architecture, which kind of database you are using. The important thing is how you are going to use it. That's where the Repositories come handy.

The idea of having the repositories is simple, A repository is a data access layer that defines a generic representation of a data store. So basically it's kind of an abstraction layer on top of our database, which we use to fetch or manipulate our data. You can also think of it as a place where we hold all of our SQLs in separate functions, so that we can use them later.

An example of a repository function could be:

function findOne(from, id) {
  // assuming this function returns an object
  return database.select("*").from(from).where(`id = ${id}`);
}

function findLast(from) {
  // assuming this function returns an object
  return database.select("*").from(from).orderBy("id DESC").limit(1);
}

function findFirst(from) {
  // assuming this function returns an object
  return database.select("*").from(from).orderBy("id ASC").limit(1);
}

Now the next layers are our Controllers. A Controller act as a mediator between resolver functions and repositories. it is responsible to control the data transmission between the RF and the Repo. The controller layer is helpful to select the most appropriate data and delivers it to the resolver.

In general, the lower levels are agnostic of the upper layers, that means from a lower level, you cannot access the top one. Also, a level has access to only one level down. That being said you should not access the database from a controller.

A controller is made of different functions. Each function accepts arguments and return only one result which is then used by a resolver function. Let's make an example out of it:

function getSingleProduct(id) {
  let product = repository.findOne("product", id);

  if product === null {
    throw "product does not exist"
  }

  return product;
}

function getLastProduct() {
  let product = repository.findLast("product");

  if product === null {
    throw "product does not exist"
  }

  return product;
}

function getFirstProduct() {
  let product = repository.findFirst("product");

  if product === null {
    throw "product does not exist"
  }

  return product;
}

As you might guess, a controller function can also call many different repository functions. But let's go to the next level.

From what we already learnt, A resolver is a function that resolves a value for a type or field in a schema. Resolvers can return objects or scalars like Strings, Numbers, Booleans, etc. My opinion is that a resolver should not do complex stuff. It only needs to manage the arguments coming from the root resolver and pass them to the right controllers, and the return the correct answer to the root resolver.

Therefore a resolver function has access to many controllers and tries to create the right answer for the user, through the controllers. So basically:

function product(args) {
  switch (true) {
    case args.first && args.last:
      throw "cannot get first and last product at the same time"
    case (args.first || args.last) && args.id > -1:
      throw "cannot get an specific product when first or last are defined"
  }

  if (args.first) return controller.getFirstProduct();
  if (args.last) return controller.getLastProduct();
  if (args.id > -1) return controller.getSingleProduct(args.id);

  throw "please define one of 'first', 'last' or 'id'"
}

So far so good. The resolver gets the correct argument and return a result. Now let us go to the root resolver and see how or schema looks like:

var schema = buildSchema(`
  type Query {
    product(id: Int = -1, first: Blooean = false, last: Boolean = false): Product
  }
`);

We have implemented a basic graphql server. Although this is NOT the best implementation of a graphql server, but it helps to present the idea. It also has an advantage of separating the concerns. Though you can make the complexity less, by removing the controller layer, thus speeding up your development process.

Folder structure might at the end look something like this:

Conclusion

GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more.

We have now learned how to implement our own GraphQL server with Node.js and Express. By using the Express middleware express-graphql setting up a GraphQL server is really easy and is requiring only a few lines of code. If you'd like to dive deeper into the GraphQL language also check out http://graphql.org/learn.

Also, don't forget to take a look at https://opencrud.org

Did you find this article valuable?

Support Aien Saidi by becoming a sponsor. Any amount is appreciated!