Imagine you want to speak to someone at a party, but the noise and music all around won’t let you. Web sockets are like a private line that lets the two of you keep talking without interference from the noise. The technology provides real-time, two-way communication between a server and a user. Web sockets are used for alerts, notifications, remote monitoring, real-time analytics, and more, and they apply to any setup or system where real-time communication is required. To understand web sockets and appreciate their use, we must also talk about HTTP, and we will do this further in the article.
For now, let us take a closer look at what web sockets are, how they work, and what else you need to know about this versatile technology.
Web Sockets: What Are They?
Web sockets are a communication protocol that provides a full-duplex, bidirectional channel over a single TCP connection between a user/client and a server. The client opens the connection, and it lasts until either the server or the client closes it. While the connection is open, the server does not have to wait for a request: it can send data to the client whenever it has something new. We see a prevalent example in chat applications. Once your chat app is connected, the server can deliver a message to you without your app asking for it first. This is not a perfect illustration, but as we delve deeper into the subject, you will get the complete picture of what a web socket represents.
Let’s look at a brief history of web sockets.
A Brief History of Web Sockets
Web applications became popular in the 2000s, when innovation in dynamic, responsive, and interactive end-user experiences exploded. Even so, real-time communication as we have it today was still hard to achieve. What was available were HTTP-based techniques such as AJAX and Comet, which were workarounds rather than technologies designed for real-time use. This limitation prompted the search for a better alternative.
Ian Hickson and Michael Carter were among the developers concerned with these limitations. In 2008, collaborating over IRC and the W3C mailing lists, they came up with a new standard for real-time communication on the web. This is how “Web Sockets” came about.
Web Sockets: Protocols and APIs
A web socket connection stays open until either the server or the client closes it, so the two can keep exchanging data for as long as they need, with minimal overhead per message.
Web socket technology is made up of two core components:
- The Web Socket Protocol
- The Web Socket API
Web Socket Protocol
This is what we have been discussing so far. The Web Socket protocol enables bi-directional communication between a server and a web client over a TCP connection. It was standardised in December 2011 by the Internet Engineering Task Force (IETF) as RFC 6455. The Web Socket Protocol Registries are maintained by the Internet Assigned Numbers Authority (IANA), which keeps track of the codes and parameters used by the protocol.
Web Socket API
On the other hand, the Web Socket API, contained in the HTML Living Standard, is a programming interface to establish Web socket connections between clients and servers for data exchange. This technology provides a standard way for developers to integrate web sockets in their applications and software.
Today, all modern web browsers support the Web Socket API. Web sockets are also available through many libraries and frameworks, both commercial and open source.
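As a rough illustration of what the Web Socket API looks like in practice, the sketch below wires up a client with the standard browser `WebSocket` interface. The `openChat` helper, its `onText` callback, and the URL `wss://example.com/chat` are hypothetical names used only for this example. The constructor is passed in as a parameter so the same code can run outside a browser with a compatible implementation.

```javascript
// Minimal sketch of a Web Socket client built on the standard browser API.
// `openChat` and the example URL are hypothetical, not part of any library.
function openChat(url, onText, WebSocketImpl = globalThis.WebSocket) {
  const ws = new WebSocketImpl(url);

  // Fires once the opening handshake has succeeded.
  ws.onopen = () => ws.send("hello");

  // Fires every time the server pushes a message; no polling is needed.
  ws.onmessage = (event) => onText(event.data);

  // Fires when either side closes the connection.
  ws.onclose = (event) => console.log("closed with code", event.code);

  return ws;
}

// Usage in a browser (assumed endpoint):
// const ws = openChat("wss://example.com/chat", (text) => console.log(text));
```

Note that the application code never touches the handshake or the framing; the browser handles both behind these few callbacks.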
What is a TCP Connection?
Since we will use this term often in this article, let’s define it.
The Transmission Control Protocol (TCP) is a communication standard that lets digital devices or computer applications send and receive data reliably and in order over a network.
Now that we’ve got that out of the way, let’s talk about WHY we need Web Sockets.
Web Sockets: Why They Are Important
As hinted previously, web sockets became necessary because of the limitations of HTTP-based techniques. To explain why web sockets are essential, we have to discuss HTTP.
HTTP is a request/response protocol in which every exchange is initiated by the client, which asks the server for data or resources. The server can only answer; it cannot send the client anything it has not been asked for. Most traffic on the public web uses HTTP.
Here is another way to understand this protocol. Let’s say you open our website https://netnut.io/. Your web browser sends a request to a backend server using HTTP, and the server sends back a response. This request/response exchange travels over a TCP connection. In the simplest case, the connection is closed once the response has been delivered.
In that simple model, every time you open a URL in your web browser or web app, a TCP connection is opened and then closed after the response arrives. If you make 20 different requests, 20 connections are opened and closed. HTTP/1.1 can keep a connection alive and reuse it for several requests, but each exchange still has to be started by the client.
Now, imagine the amount of time and resources consumed by all those requests! Creating 20 TCP connections takes time and increases the latency of web applications.
Again, with HTTP, the client says to the server, “Give me data,” and the server replies. But what if the server wants to send a message to the client without the user requesting it? Plain HTTP has no way to do this.
This makes plain HTTP a poor fit for chat applications, where a friend can send you a message even though you did not send them one first. The server cannot open a connection to your browser to deliver the message; it has to wait until your browser asks.
This is why Web Sockets are important!
Web sockets are designed for exactly this kind of application, where either the server or the user can send data at any time.
Web Sockets: How Do They Work?
In summary, web sockets work this way.
We have a server and a user, both of which are computers or other communication devices.
When the user and the server want to communicate, they establish a TCP connection and exchange data back and forth while the connection is kept open. They can keep exchanging data until one of them decides to close the connection. This is basically how web sockets work.
Suppose you are chatting with a friend in your browser, and your friend sends a new message, which arrives at the server. The server can alert your computer, “Hey, you have a new message,” without your browser having to poll for new messages.
So, what is polling?
What is Polling?
The polling mechanism, or simply poll, describes a situation where the user/client repeatedly sends requests to the server to check whether it has anything new, because the server cannot push data on its own. There are two types of polling:
- Long Polling: Here, the server holds each request open until it has new data or a timeout expires. As soon as the client gets a response, it sends the next request.
- Short Polling: Here, the client sends a request at a fixed time interval, and the server answers immediately, even when there is nothing new.
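To make the cost of polling concrete, the sketch below simulates short polling under simplified assumptions: the client asks for new messages every `intervalSec` seconds, and each message waits on the server until the next poll picks it up. The function name, its parameters, and the sample arrival times are illustrative and not taken from any real system.

```javascript
// Simulates short polling: returns how many requests were made and how long
// each message waited on the server before a poll picked it up.
// Assumes `arrivalTimesSec` is sorted in ascending order.
function simulateShortPolling(arrivalTimesSec, intervalSec, durationSec) {
  const delays = [];
  let requests = 0;
  let delivered = 0; // index of the next undelivered message
  for (let t = intervalSec; t <= durationSec; t += intervalSec) {
    requests += 1; // one HTTP request per poll, even if nothing is new
    while (delivered < arrivalTimesSec.length && arrivalTimesSec[delivered] <= t) {
      delays.push(t - arrivalTimesSec[delivered]);
      delivered += 1;
    }
  }
  return { requests, delays };
}

// Two messages arrive at 3 s and 14 s; the client polls every 5 s for 20 s.
const result = simulateShortPolling([3, 14], 5, 20);
console.log(result); // { requests: 4, delays: [ 2, 1 ] }
```

In this toy run, four requests are spent to deliver two messages, and each message arrives late. With a web socket, the server would push each message as it arrives and no polling requests would be needed.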
What Is the Web Socket Handshake?
A web socket connection is created using the Web Socket handshake. This “handshake process” sets up a client-server channel over which both the server and the user can send and receive data. In browsers, the handshake is performed for you by the JavaScript Web Socket API. On the server side, it can be implemented in languages and runtimes such as Python, Java, and Node.js.
Web Sockets Life Cycle
A web socket connection goes through three main stages in its life cycle:
Opening the Web Sockets Connection
This stage is the Web Socket handshake. The life of a web socket begins as a standard HTTP(S) request and response. If the handshake is successful, the server and user agree to keep using the TCP connection that was opened for the HTTP request, but now as a Web Socket connection. Both parties can then send and receive data, and when either of them closes the connection, it is torn down.
Unlike HTTP URLs, which begin with “http://” or “https://”, web socket URLs begin with “ws://”, or “wss://” for a secure web socket. The rest of the URL is the same as what you find in an HTTP URL, that is, a host, an optional port, a path, and optional query parameters.
“ws://” host [“:” port ] path [“?” query]
“wss://” host [“:” port ] path [“?” query]
These two schemes are the only ones defined for web socket connections. That means if you have a URL with “ws://” or “wss://”, the server and client MUST follow the Web Socket protocol as laid out in the Web Socket specification.
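The URL layout above can be explored with the standard `URL` class, which is available in browsers and in Node.js. The helper name `describeSocketUrl` and the sample address are assumptions made for this sketch.

```javascript
// Splits a Web Socket URL into the parts named in the scheme above.
// `describeSocketUrl` is a hypothetical helper built on the standard URL class.
function describeSocketUrl(raw) {
  const url = new URL(raw);
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error("Not a Web Socket URL: " + raw);
  }
  const secure = url.protocol === "wss:";
  return {
    secure,
    host: url.hostname,
    // The URL class reports an empty port when the default is used:
    // 80 for ws:// and 443 for wss://.
    port: url.port === "" ? (secure ? 443 : 80) : Number(url.port),
    path: url.pathname,
    query: url.search,
  };
}

console.log(describeSocketUrl("wss://example.com/chat?room=42"));
```

The defaults mirror HTTP: an unencrypted web socket uses port 80 unless told otherwise, and a secure one uses port 443.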
Web Sockets are created by upgrading the HTTP request/response connection pair. A client/user that supports the Web sockets protocol and wants to communicate would send an HTTP request with the following required headers:
- Connection: Upgrade
In ordinary HTTP traffic, the Connection header determines whether the connection remains open after the current exchange is complete; the keep-alive value keeps it open for future requests to the same server. During the web socket opening handshake, the header is set to Upgrade, signalling that the connection should be kept open and switched to a non-HTTP protocol.
- Upgrade: websocket
Clients use the Upgrade header to ask the server to switch to one of the listed protocols, given in descending order of preference. The websocket value signals to the server that the client wants to create a Web Socket connection.
- Sec-WebSocket-Key: q4xkc032u266gasTuKaS0w==
This Sec-WebSocket-Key is a 16-byte, base64-encoded random one-time value generated by the client side.
- Sec-WebSocket-Version: 13
This is the version of the Web Socket protocol the client wants to use. 13 is the only version accepted under RFC 6455.
All these headers are assembled to form the HTTP GET request the client sends when it connects to a ws:// URL. An example for ws://anything.com:8080/ is shown below:
GET / HTTP/1.1
Host: anything.com:8080
Connection: Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: q4xkc032u266gasTuKaS0w==
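A request like the one above could be assembled along the following lines. This is only a sketch for Node.js: `buildUpgradeRequest` is a hypothetical helper, and the host and path passed to it are placeholders. Browsers build this request themselves, so application code rarely needs to.

```javascript
const { randomBytes } = require("node:crypto");

// Builds the text of a Web Socket opening handshake request.
// `buildUpgradeRequest` is a hypothetical helper, shown for illustration.
function buildUpgradeRequest(host, path) {
  // Sec-WebSocket-Key: 16 random bytes, base64-encoded (24 characters).
  const key = randomBytes(16).toString("base64");
  const lines = [
    `GET ${path} HTTP/1.1`,
    `Host: ${host}`,
    "Connection: Upgrade",
    "Upgrade: websocket",
    "Sec-WebSocket-Version: 13",
    `Sec-WebSocket-Key: ${key}`,
  ];
  // HTTP header lines end with CRLF, and a blank line ends the header block.
  return { key, request: lines.join("\r\n") + "\r\n\r\n" };
}

const { key, request } = buildUpgradeRequest("example.com:8080", "/chat");
console.log(request);
```

The key is returned alongside the request because the client needs it again later, to check the server's Sec-WebSocket-Accept header.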
After the client sends this connection request, it waits for the server to reply. The server’s reply must contain the HTTP 101 Switching Protocols status code, which indicates that the server has agreed to switch to the protocol requested by the client in its Upgrade request header. The response must also carry HTTP headers confirming that the connection has been upgraded:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: cA9dggdnMPU79lukAE3W4TRnyUT=
- Connection: Upgrade
This confirms that the connection has been upgraded.
- Upgrade: websocket
This confirms that the protocol the connection has switched to is Web Socket.
- Sec-WebSocket-Accept: cA9dggdnMPU79lukAE3W4TRnyUT=
The Sec-WebSocket-Accept value is generated by concatenating the client’s Sec-WebSocket-Key with the fixed value 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 specified in RFC 6455, hashing the result with SHA-1, and base64-encoding the hash. The Sec-WebSocket-Key and Sec-WebSocket-Accept headers exist so that the user/client and the server can each be sure the other supports Web Sockets. This is important because the Web Socket connection re-uses the HTTP connection, so a party that does not understand the protocol could otherwise mistake Web Socket data for ordinary HTTP traffic.
Once the client has verified the server’s response, the Web Socket connection is open and the parties can begin exchanging data.
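The derivation of Sec-WebSocket-Accept can be sketched in a few lines of Node.js. The function name `computeAccept` is an assumption for this example; the sample key and the expected result are the ones used in the handshake example in RFC 6455.

```javascript
const { createHash } = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Derives the Sec-WebSocket-Accept value from the client's key:
// base64( SHA-1( key + GUID ) ).
function computeAccept(secWebSocketKey) {
  return createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Sample key and expected result taken from the example in RFC 6455.
console.log(computeAccept("dGhlIHNhbXBsZSBub25jZQ=="));
// prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client runs the same calculation on the key it sent and compares the result with the header it receives from the server.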
Data Exchange over Web Sockets
Once the connection is established, messages or requests can be sent asynchronously by the user or server.
For instance, a simple message sent by the client from the web browser using JavaScript might look like this:
ws.send("John Luke");
Web Socket messages can carry text or binary data in any format. Many modern applications use JSON to send structured data in Web Socket messages.
For instance, a chatbot can use Web Sockets to send a message like:
{"user": "Mat Crocker", "content": "I didn’t plan to answer your ridiculous question, but to be a techie"}
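A JSON message like the one above has to be turned into a string before it is sent and parsed again on arrival. The sketch below shows one way to do that; the helper names are hypothetical, and the `user` and `content` fields simply follow the example message.

```javascript
// Encodes a chat message as a JSON string ready to pass to ws.send().
// The field names `user` and `content` follow the example above.
function encodeChatMessage(user, content) {
  return JSON.stringify({ user, content });
}

// Decodes an incoming text message, returning null for anything that is not
// a well-formed chat message, so bad input cannot crash the handler.
function decodeChatMessage(raw) {
  try {
    const data = JSON.parse(raw);
    if (data === null || typeof data !== "object") return null;
    if (typeof data.user !== "string" || typeof data.content !== "string") {
      return null;
    }
    return { user: data.user, content: data.content };
  } catch {
    return null;
  }
}

const wire = encodeChatMessage("Mat Crocker", "Hello there");
console.log(wire); // {"user":"Mat Crocker","content":"Hello there"}
console.log(decodeChatMessage(wire).content); // Hello there
```

Validating incoming messages matters because the protocol itself places no constraints on what a message contains.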
Closing the Web Sockets Connection
To close a Web Socket connection, a closing frame (opcode 0x08) is sent. Aside from the opcode, the frame may contain other information, such as a status code and the reason for closing the connection. When one party receives a closing frame, it must send a close frame in response, after which no additional data is sent and the TCP connection is closed.
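To show what a closing frame looks like on the wire, the sketch below builds one byte by byte in Node.js. It covers only the unmasked form a server would send; frames sent by a client must also be masked, which is left out here. The function name `buildCloseFrame` is an assumption for this example.

```javascript
// Builds an unmasked close frame, as a server would send it.
// Frames sent by a client must additionally be masked (RFC 6455),
// which this sketch leaves out.
function buildCloseFrame(code, reason = "") {
  const reasonBytes = Buffer.from(reason, "utf8");
  const payloadLength = 2 + reasonBytes.length; // 2-byte status code + reason
  if (payloadLength > 125) {
    throw new RangeError("Control frame payloads are limited to 125 bytes");
  }
  const frame = Buffer.alloc(2 + payloadLength);
  frame[0] = 0x88; // FIN bit (0x80) + close opcode (0x08)
  frame[1] = payloadLength; // mask bit is 0, so this byte is just the length
  frame.writeUInt16BE(code, 2); // status code in network byte order
  reasonBytes.copy(frame, 4);
  return frame;
}

// 1000 is the status code for a normal closure.
console.log(buildCloseFrame(1000, "bye").toString("hex")); // 880503e8627965
```

In a browser none of this is visible: calling `ws.close(1000, "bye")` makes the browser build and send the frame.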
Difference Between Web Sockets and HTTP Protocol
Web Socket | HTTP |
Bi-directional: once the connection is open, either the client or the server can send data at any time. | Request/response: the server sends data only in reply to a request that the client initiates. |
Stateful: the connection stays open and is reused for every message, which suits real-time applications. | Stateless: each request is handled independently, which suits RESTful applications and one-off requests. |
Lower overhead per message once connected, because HTTP headers are not resent with every message. | Higher overhead per message, because every request and response carries a full set of headers. |
When to Use Web Sockets
Web sockets can be used for various applications, such as:
- Real-time web applications: These web applications use web sockets to display data on the client side as the server keeps sending it from the backend. Data is pushed over the already open connection, so no new connection or request is needed for each update, which is why web sockets deliver updates faster than repeated HTTP requests. All things being equal, this means there is little or no lag between the server and the client when sending and receiving messages.
For instance, the prices of Bitcoin and other cryptocurrencies on a trading website are updated continuously as they fluctuate. This is possible with the help of web sockets.
- Gaming applications and software: Web sockets are beneficial in gaming applications, where data is continuously sent and received by the server. Changes show on the screen without the UI being refreshed, and when the UI does get refreshed, it happens over the already established connection, keeping the game seamless.
- Chat applications: This is one of the most common applications of web sockets. Chat providers use web sockets to create a connection once and then reuse it for broadcasting, publishing, and exchanging messages among subscribers or users, as well as for client-to-client data and file transfer.
Other ways to use web sockets include:
- Live scores and traffic updates
- Frontend and backend real-time sync
- Live location tracking in food delivery apps and urban mobility apps
- Shared projects and multiplayer collaboration whiteboards
- Telecommunication and teleconferencing
When Not to Use Web Sockets
Web sockets are only necessary when we need real-time communication, with continuous or frequently updated streams of data transmitted over a network. When we want to request old data, or we need the data only once, plain HTTP is enough.
Best Web Sockets Alternatives
- MQTT
- HTTP long polling
- Server-Sent Events
- WebRTC
- WebTransport
Web Socket Cost
Setting up a single web socket connection is inherently cheap, since the protocol is designed to be efficient and lightweight, with minimal overhead. However, setting up and managing a reliable, scalable web socket communication system in-house is time-consuming, requires considerable engineering effort, and, of course, is expensive.
Consider these estimates:
- Building a basic in-house web socket infrastructure with limited scalability takes around 10.2 person-months.
- Maintaining a self-built web socket installation costs between $100k and $200k a year.
Advantages of Web Sockets
- Real-time communication
- Reduced latency
- Improved performance
- Improved responsiveness of web applications
- More flexibility than HTTP
- Greater efficiency, which helps to reduce bandwidth use and server load
Disadvantages of Web Sockets
- Web sockets are not optimised for video and audio data streaming.
- Web socket connections are not re-established automatically if they are terminated.
- Some corporate networks that use proxy servers block Web Socket connections.
- The stateful nature of web sockets makes them challenging to use with large-scale systems.
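Because a terminated connection is not restored automatically, applications usually add their own retry logic. One common approach, shown here as an assumption rather than something this article prescribes, is exponential backoff: wait a little longer after each failed attempt, up to a cap. The function name and the default values are illustrative.

```javascript
// Delay before reconnect attempt number `attempt` (0-based), doubling each
// time and capped at `maxMs`. All values here are illustrative defaults.
function reconnectDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

console.log([0, 1, 2, 3, 4, 5, 6].map((n) => reconnectDelayMs(n)).join(", "));
// prints 1000, 2000, 4000, 8000, 16000, 30000, 30000
```

A client would typically call this from its `onclose` handler and use the result as the delay before opening a new connection.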
Web Sockets: Frequently Asked Questions
Are Web Sockets Scalable?
Web sockets can scale, although doing so takes engineering effort. Top companies like Netflix, Uber, and Slack use Web Sockets for the real-time functionalities of their applications. For instance, Slack uses Web sockets for sending and receiving instant messages between users.
Are Web Sockets Secure?
The security of web sockets depends mainly on the measures taken by the developers. The wss:// URL scheme indicates a secure web socket connection. This means the connection is encrypted with TLS (the successor to SSL), which protects the data transmitted between the server and the client against eavesdropping and tampering by third parties.
Which Web Browsers Support Web Sockets?
- Mozilla Firefox (version 4 upwards)
- Google Chrome (version 4 upwards)
- Microsoft Edge (version 12 upwards)
- Safari (version 5 upwards)
- Opera (version 10.70 upwards)
- Internet Explorer (version 10 upwards)
Web Sockets: Final Thoughts
Web socket connections are persistent, full-duplex, bi-directional channels between a web server and a web client or user. They are a more modern means of maintaining ongoing interaction between a server and a client, where both ends can send messages at any time.
Both HTTP and Web sockets are built on TCP. If you are still deciding which to use, list your requirements and head over to the pros and cons section of this article to see what will work for you. If you are looking at running real-time applications and the cons do not matter, then you should use Web Sockets.
If you want more reading on topics like these, check out our website.