The HTML And The Web Pages

04 Aug 2022

In my earlier posts, The Static Fundamentals and The Dynamic Web, I gave an overall introduction to the field of web development, but in this post we'll take a foundational yet deeper look at HTML and web pages. So are you ready for the ride? Here we go! 🚀

To start off, let me say it straight: HTML stands for HyperText Markup Language, and it's not a programming language. (Oops... debate triggered 🤦.) First of all, we should understand what exactly a markup language is.

A markup language is a way of encoding a document that, along with the text, incorporates labels or marks that contain additional information about the structure of the text or its presentation.

Simply put, a markup language is used to present a certain piece of data in a specific structure, be it text, images, tables, etc. It can't be used to program a task, which is what a programming language must be capable of. HTML, XML and SVG are all examples of markup languages.

Now, getting back to HTML: the text you read on web pages isn't just plain text. It carries a whole lot of extra information about its structure and how it should be presented. Consider the following sentence as an example:
My Name is Sapinder Singh. I'm a content creator.

The HTML code for the sentence above is:

<b>My Name is <i>Sapinder Singh</i>. I'm a <i>content creator</i>.</b>

As you can see, the tags <b> (for bold) and <i> (for italic) each present the text they wrap in their own way. Consider another example:
"I use VS Code for coding."

This text contains a link to another web page that you can navigate to by clicking on the highlighted text. And this is basically what the "HyperText" in HTML means. 🫢 Hypertext is any text that contains a reference to some other text, which can be accessed by interacting with the hyperlink (which in this case is "VS Code"). So essentially, that's how we jump from one webpage to another.

The Syntactical Overview Of HTML

HTML is the building block of a webpage. It identifies different types of text and media with different tags, and most tags come as a pair: an opening tag and a closing tag. For example, if we want to write a paragraph of text, we would open the p tag as in <p>, insert the text, and then close the tag as in </p>. Then we have the attributes for each tag.

An attribute is just a property of a tag that holds a certain type of value. We define an attribute in the form name="value". For example, look at the following snippet that defines a hyperlink with the text "VS Code". The a (anchor) tag is used to define a hyperlink, and it needs the href attribute, which should point to a URL.

<a href="https://code.visualstudio.com">VS Code</a>
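To tie the tag pair and the attribute together, here is a minimal sketch. The sentence text is only illustrative, and the target and rel attributes are additions for the example (they ask the browser to open the link in a new tab); only href comes from the snippet above.

```html
<!-- A paragraph: opening tag, content, closing tag -->
<p>I write code every day.</p>

<!-- A link nested inside a paragraph; attributes are name="value"
     pairs written inside the opening tag -->
<p>
  I use
  <a href="https://code.visualstudio.com" target="_blank" rel="noopener">VS Code</a>
  for coding.
</p>
```

Notice that attributes only ever appear in the opening tag; the closing tag is just the tag name with a slash.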

Now, there are some tags that don't need to be closed. These are called void elements, though you'll often hear them called self-closing tags. They can't wrap any content; they only take attributes. Common examples are <img>, <br> and <input>.

Every HTML5 document must also begin with <!DOCTYPE html>. Strictly speaking, it's a declaration rather than a tag, and it tells browsers to follow the most relevant specifications for rendering the page. You can find a brief and really nice explanation of Doctype on this page by MDN. Everything else in an HTML document goes inside the html tag, which is why it is called the root element of a webpage. The html tag comprises two parts: head and body.

<!DOCTYPE html>
<html>
  <head></head>
  <body></body>
</html>

head

The head of a webpage is not directly visible to the user because the information it contains is meant for the browser rather than the reader. It contains information such as:

title of the page, which is shown in the browser's title bar or tab and is used by search engines.
styles for styling the web page. These styles are written in CSS.
scripts for adding executable code to the web page. Scripts are mainly used to make web pages interactive, for example by handling user interaction.
meta and link tags for additional information such as the character set, favicons, and other types of metadata.

Here's an example of what we usually see inside the head tag:

<head>
  <!-- defines the character set of the web page -->
  <meta charset="UTF-8">

  <!-- defines the title of the web page -->
  <title>The Theory Of HTML</title>

  <!-- links the styles.css file to the web page -->
  <link href="./styles.css" rel="stylesheet" type="text/css">

  <!-- links the fonts from Google Fonts -->
  <link 
    href="https://fonts.googleapis.com/css2?family=Poppins:ital,wght@0,400;0,600;0,700;0,800;1,600&display=swap" 
    rel="stylesheet">

  <!-- links the script.js file to the web page -->
  <script src="./script.js"></script>
</head>

body

The body tag represents the visible portion of a web page. Anything that you read, see or interact with on a web page is placed inside the body of that page. Each type of content is represented by a specific tag; some of the common tags are:

p - represents a paragraph of text.
img - represents an image. It requires the src attribute that should contain the path to the image.
section - represents a section of a page.
h1 to h6 - represent the six heading levels for a section. <h1> is the highest level and <h6> is the lowest; there is no heading level below <h6>.
button - represents a button.
form - represents a form for submitting some information.
input - represents an input field. It can accept different types of input depending on the value of the type attribute.
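
As a rough sketch, here is how the tags listed above might sit together inside a body. The image path, the form's action URL and all of the text are placeholders made up for this example. Note that img and input are written without a closing tag.

```html
<body>
  <section>
    <h1>Newsletter</h1>
    <p>Leave your email and I'll let you know about new posts.</p>

    <!-- img has no closing tag; the path is a placeholder -->
    <img src="./images/avatar.png" alt="Portrait of the author">

    <!-- the action URL is hypothetical -->
    <form action="/subscribe" method="post">
      <!-- input has no closing tag; type decides what it accepts -->
      <input type="email" name="email" placeholder="you@example.com">
      <button type="submit">Subscribe</button>
    </form>
  </section>
</body>
```

Changing type="email" to something like type="checkbox" or type="date" turns the same input tag into a completely different kind of field.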

You can read more about the different kinds of HTML elements on MDN's HTML reference page.

How A Web Page Is Rendered

Alright, now you know how we write HTML, but you should also know how a web page is actually rendered along with the resources that are linked to the HTML file. As soon as the HTML file is received by the client's browser, it starts creating what is called the Document Object Model (DOM). The DOM is an inverted-tree-like object model whose purpose is to represent the HTML document in memory and make it accessible to the browser and to the scripts run by its JavaScript engine. The root of this tree is the html element, which branches out into head and body, which further branch out into their children, and so on.

The DOM represents a document with a logical tree. Each branch of the tree ends in a node, and each node contains objects. DOM methods allow programmatic access to the tree. With them, you can change the document's structure, style, or content. (Source: MDN)
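
To make the "programmatic access" part concrete, here is a small sketch of a page whose script changes the content, style and structure of the document through the DOM. The id, the text and the color are made up for the example; the DOM methods themselves (getElementById, createElement, appendChild) are standard.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>DOM sketch</title>
  </head>
  <body>
    <h1 id="greeting">Hello</h1>

    <!-- placed after the h1 so the element already exists in the DOM -->
    <script>
      // document is the entry point to the DOM tree
      const heading = document.getElementById("greeting");

      // change the content
      heading.textContent = "Hello, DOM!";

      // change the style
      heading.style.color = "rebeccapurple";

      // change the structure: create a node and attach it to body
      const note = document.createElement("p");
      note.textContent = "This paragraph was added by a script.";
      document.body.appendChild(note);
    </script>
  </body>
</html>
```

If you save this as an .html file and open it in a browser, the heading you see is the modified one, because the script edits the in-memory tree and not the file itself.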

While building the DOM tree, the browser encounters several linked resources that were defined in the head of the document. Considering the snippet I provided in the head section, here's what happens next:

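When it encounters the link for the ./styles.css file, it requests the file from the server and continues parsing the HTML while the file downloads. A stylesheet is render-blocking rather than parser-blocking: the browser won't paint the page until the styles have been downloaded and parsed.
Then it encounters the link for the fonts. This is also a stylesheet, so it is requested and parsed in the same way; the font files it refers to are typically downloaded later, once the browser knows that text on the page needs them.
Finally, it encounters the script tag, which it downloads and executes depending on the attributes provided. Without an async or defer attribute, parsing pauses until the script has been downloaded and executed. (Read more here)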
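
The script attributes mentioned in the last step can be sketched like this. The three tags are alternatives shown side by side for comparison, not something you would write together, and analytics.js is a hypothetical file name.

```html
<head>
  <!-- no attribute: parsing pauses while the script downloads and runs -->
  <script src="./script.js"></script>

  <!-- defer: downloads in parallel with parsing and runs only after
       the document has been parsed, in document order -->
  <script src="./script.js" defer></script>

  <!-- async: downloads in parallel and runs as soon as it is ready,
       which may interrupt parsing at any point -->
  <script src="./analytics.js" async></script>
</head>
```

As a rule of thumb, defer suits scripts that need the full DOM, while async suits independent scripts that don't rely on the page or on other scripts.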

Once the browser finishes parsing the document and styles, it combines them into a Render Tree and figures out the layout of the web page. Then it finally paints the document, and we're able to see the web page in the browser window. This whole process is known as the Critical Rendering Path. If you want to learn more about it, you can read this guide by MDN.

Wrapping Up

Alright, I hope you found this guide helpful to deepen your knowledge about HTML and web pages! If you liked it, feel free to explore more content like this on my blog!