Parsing
The parsing stage takes the raw HTML and CSS received from the server and converts it into structured representations that the browser can work with: the Document Object Model (DOM) for HTML and the CSS Object Model (CSSOM) for CSS. The browser uses both to render the page onto the screen.
HTML Parsing
Tokenization
The browser reads the characters of the HTML file sequentially and breaks them down into tokens.
Tokens are the smallest meaningful units of the language's syntax, such as start tags (with their attributes), end tags, comments, and text content.
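The tokenization step can be sketched with Python's standard html.parser module, which reports start tags, end tags, and text as it reads the input. This is only an illustration, not how a browser engine is implemented; the TokenCollector class, the tokenize helper, and the (kind, value, attrs) tuple shape are names made up for this sketch.

```python
from html.parser import HTMLParser


class TokenCollector(HTMLParser):
    """Collects (kind, value, attrs) tuples; the tuple shape is invented for this sketch."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag, {}))

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text to keep the output readable
            self.tokens.append(("text", data, {}))


def tokenize(html):
    """Read the HTML string character by character and return its tokens in order."""
    collector = TokenCollector()
    collector.feed(html)
    collector.close()
    return collector.tokens


tokens = tokenize('<p class="intro">Hello <b>world</b></p>')
for token in tokens:
    print(token)
```

Each printed tuple corresponds to one token: a start tag carries its attributes, while end tags and text carry none.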
Syntax Analysis
Once the tokens are generated, the browser uses a parser to interpret the structure and syntax of the HTML code.
The parser organizes the tokens into a hierarchical tree structure known as the Document Object Model (DOM).
This DOM represents the logical structure of the document and contains elements, attributes, and their relationships.
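A minimal sketch of how tokens might be organized into a tree, assuming a simplified (kind, value) token stream and a stack of open elements. Real HTML tree construction has many more rules (implied tags, error recovery, insertion modes); the build_tree function and the dict-based node shape are hypothetical.

```python
def build_tree(tokens):
    """Build a nested dict tree from (kind, value) tokens using a stack of open elements."""
    root = {"tag": "#document", "children": []}
    stack = [root]  # the last entry is the element currently being filled
    for kind, value in tokens:
        if kind == "start":
            node = {"tag": value, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif kind == "end":
            # Close back to the matching open element; stray end tags are ignored
            for i in range(len(stack) - 1, 0, -1):
                if stack[i]["tag"] == value:
                    del stack[i:]
                    break
        elif kind == "text":
            stack[-1]["children"].append({"tag": "#text", "text": value})
    return root


tokens = [
    ("start", "html"), ("start", "body"), ("start", "p"),
    ("text", "Hi"), ("end", "p"), ("end", "body"), ("end", "html"),
]
tree = build_tree(tokens)
```

The parent/child nesting of the dicts mirrors the relationships between elements that the DOM records.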
Script Parsing and Execution
In addition to HTML parsing, the browser may also encounter and parse inline or external JavaScript code within <script> tags. Reentrant parsing refers to the ability of a parser to pause parsing at any point, handle another task or event, and then return to parsing from the exact point where it left off without losing context or state.
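The pause-and-resume idea can be illustrated with a Python generator, which keeps its local state while control is handed back to the caller. This is a sketch under simplifying assumptions: the token stream, the parse function, and the log format are all invented here, and "running" a script is just a log entry.

```python
def parse(tokens, log):
    """Generator-based parser: yields at each script so the caller can run it, then resumes."""
    open_elements = []
    for kind, value in tokens:
        if kind == "start":
            open_elements.append(value)
            log.append(f"open {value}")
        elif kind == "end":
            open_elements.pop()
            log.append(f"close {value}")
        elif kind == "script":
            # Pause here: the stack of open elements survives until we are resumed
            yield value
            current = open_elements[-1] if open_elements else "#document"
            log.append(f"resume inside {current}")


log = []
tokens = [("start", "body"), ("script", "a.js"), ("start", "p"), ("end", "p"), ("end", "body")]
for script in parse(tokens, log):
    log.append(f"run {script}")  # stand-in for fetching and executing the script

print(log)
```

After the script runs, the parser continues inside the same open element it was in when it paused, which is the "without losing context or state" property described above.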
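Whenever the parser encounters a <script> tag without the async or defer attribute, it pauses parsing, fetches the script from the network if it is external, and executes it, because the script may alter the document. Execution can also be deferred (defer) or made asynchronous (async) so that parsing can continue uninterrupted while the script downloads.
<link> and <style> tags behave differently: stylesheets do not normally pause the HTML parser, but they block rendering and delay the execution of later scripts until the styles have loaded.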
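The difference between blocking and deferred scripts can be sketched as a small simulation. It assumes classic (non-module) scripts only and leaves async scripts out of the timeline, since they run whenever their download finishes; script_mode and simulate are hypothetical helpers, not browser APIs.

```python
def script_mode(attrs):
    """Classify a classic <script> by its attributes (module scripts are not covered)."""
    if "src" not in attrs:
        return "blocking"  # inline classic scripts ignore async and defer
    if "async" in attrs:
        return "async"
    if "defer" in attrs:
        return "defer"
    return "blocking"


def simulate(items):
    """Return the event order for a list of ('html', text) or ('script', attrs) items."""
    events, deferred = [], []
    for kind, value in items:
        if kind == "html":
            events.append(f"parse {value}")
            continue
        mode = script_mode(value)
        if mode == "blocking":
            events.append(f"pause, run {value.get('src', 'inline')}")
        elif mode == "defer":
            deferred.append(value["src"])  # runs in document order once parsing is done
        # async scripts are omitted: their position depends on network timing
    events.append("parsing done")
    events += [f"run deferred {src}" for src in deferred]
    return events


page = [
    ("html", "<h1>"),
    ("script", {"src": "a.js"}),
    ("script", {"src": "b.js", "defer": ""}),
    ("html", "<p>"),
]
for event in simulate(page):
    print(event)
```

In this sketch a.js interrupts parsing between the heading and the paragraph, while b.js waits until the whole document has been parsed.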
Speculative parsing
Speculative parsing, also known as speculative loading, is a technique used by web browsers to improve page performance by predicting and starting the loading and parsing of external resources (scripts, stylesheets, and images) before the main parser reaches them or the page explicitly requests them.
Prediction
The browser analyzes the current page and makes predictions about which external resources will be needed based on factors such as the HTML structure, previous navigation patterns, and the content of the page.
Preloading
The browser initiates the loading of these resources in the background, typically using techniques such as preloading, prefetching, or prerendering.
Parsing
As the resources are being fetched, the browser may start parsing them before they are actually needed. This involves extracting relevant information from the resource, such as CSS rules or JavaScript code.
Execution
The browser may execute code speculatively only when doing so is safe, that is, when the code does not depend on user interaction or on the current state of the page. Otherwise, execution waits until the resource is actually needed.
Caching
Speculatively loaded resources are cached even though they are not needed yet. If they are requested later, the browser may already have them, which reduces latency and improves load times.
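The prediction and preloading steps above can be sketched as a lightweight scan over the markup that only collects resource URLs, in the spirit of what is often called a preload scanner. This is an illustration using Python's html.parser; the PreloadScanner class and scan helper are hypothetical, and a real browser would hand these URLs to its network stack rather than return a list.

```python
from html.parser import HTMLParser


class PreloadScanner(HTMLParser):
    """Collects URLs worth fetching early; it builds no tree and runs no script."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.urls.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(attrs["href"])


def scan(html):
    """Return resource URLs in the order they appear in the markup."""
    scanner = PreloadScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.urls


markup = (
    '<link rel="stylesheet" href="site.css">'
    '<script src="app.js"></script>'
    '<img src="logo.png">'
)
print(scan(markup))
```

Because the scan does far less work than full parsing, it can run ahead of the main parser and start downloads while the parser is still blocked on an earlier script.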
CSS parsing
CSS parsing involves interpreting CSS code to determine the styles that apply to each element in the DOM.
Tokenization
The CSS parser starts by tokenizing the input CSS code. It reads the characters sequentially and breaks them down into tokens.
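A rough sketch of CSS tokenization using regular expressions. The token categories below are a small, simplified subset chosen for this example (real CSS tokenization also covers strings, URLs, at-keywords, escapes, and more); TOKEN_SPEC and tokenize_css are hypothetical names.

```python
import re

# Simplified token categories for illustration only
TOKEN_SPEC = [
    ("comment", r"/\*.*?\*/"),
    ("whitespace", r"\s+"),
    ("hash", r"#[\w-]+"),
    ("number", r"\d+(?:\.\d+)?[a-z%]*"),
    ("ident", r"-?[A-Za-z_][\w-]*"),
    ("punct", r"[{}:;,.]"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC), re.DOTALL)


def tokenize_css(css):
    """Scan the input left to right and return (kind, text) tokens, skipping comments and whitespace."""
    tokens, pos = [], 0
    while pos < len(css):
        match = TOKEN_RE.match(css, pos)
        if match is None:
            raise ValueError(f"unexpected character {css[pos]!r} at position {pos}")
        if match.lastgroup not in ("comment", "whitespace"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize_css("p.note { color: red; margin: 4px }"):
    print(token)
```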
Syntax Analysis
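The parser organizes the tokens into a parse tree that represents the hierarchical structure of the CSS: a stylesheet contains rules, and each rule contains selectors and declarations. Browsers expose this structure as the CSS Object Model (CSSOM).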
Rule validation
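During parsing, the parser validates each CSS rule for syntactic correctness. It checks for errors such as missing semicolons, invalid property names, unrecognized selectors, and other syntax violations. An invalid declaration or rule is ignored rather than aborting the parse, so the rest of the stylesheet still applies.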
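A minimal sketch of declaration validation, assuming the parser skips declarations it cannot understand and keeps going. KNOWN_PROPERTIES is a tiny hypothetical stand-in for the browser's table of supported properties, and splitting on semicolons is a simplification that would break on values containing semicolons.

```python
KNOWN_PROPERTIES = {"color", "margin", "display"}  # hypothetical, far from complete


def parse_declarations(block):
    """Parse 'name: value; ...' text, keeping valid declarations and skipping invalid ones."""
    kept, dropped = {}, []
    for chunk in block.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        name, sep, value = chunk.partition(":")
        name, value = name.strip().lower(), value.strip()
        if not sep or not value or name not in KNOWN_PROPERTIES:
            dropped.append(chunk)  # skipped; parsing continues with the next declaration
            continue
        kept[name] = value
    return kept, dropped


kept, dropped = parse_declarations("color: red; colr: blue; margin 4px; display: block")
print(kept)
print(dropped)
```

Here the misspelled property and the declaration with a missing colon are skipped, while the valid declarations on either side of them survive.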
Selector matching
After parsing individual rules, the browser matches selectors against elements in the DOM to determine which rules apply to each element.
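Selector matching can be sketched for the simplest case: a compound selector built from a tag name, classes, and an id. Combinators, attribute selectors, and pseudo-classes are deliberately left out; the matches function and the dict-based element shape are assumptions made for this example.

```python
import re


def matches(selector, element):
    """Check a compound selector such as 'p.note#intro' against an element dict."""
    for part in re.findall(r"[#.]?[\w-]+", selector):
        if part.startswith("#"):
            if element.get("id") != part[1:]:
                return False
        elif part.startswith("."):
            if part[1:] not in element.get("classes", ()):
                return False
        elif part != element["tag"]:
            return False
    return True  # every part of the selector was satisfied


element = {"tag": "p", "id": "intro", "classes": ["note", "big"]}
rules = ["p", "p.note", "div.note", "#intro.big", ".missing"]
applicable = [selector for selector in rules if matches(selector, element)]
print(applicable)
```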
Cascading
Once selectors are matched to elements, the browser resolves conflicting declarations using the cascade. Conflicts are resolved first by origin and importance (!important), then by specificity, and finally by source order, with later declarations winning ties.
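The ordering used by the cascade can be sketched by ranking competing declarations for a single property. This simplified version assumes all declarations come from the same origin and handles only tag, class, and id selectors; specificity and winning_value are hypothetical helpers.

```python
import re


def specificity(selector):
    """Return (ids, classes, types) for a simple selector like 'p.note#intro'."""
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    types = len(re.findall(r"(?:^|\s)[A-Za-z][\w-]*", selector))
    return (ids, classes, types)


def winning_value(declarations):
    """Pick the winner among (selector, value, important) entries given in source order."""
    ranked = [
        (important, specificity(selector), order, value)
        for order, (selector, value, important) in enumerate(declarations)
    ]
    # Tuples compare left to right: importance, then specificity, then source order
    return max(ranked)[3]


color_declarations = [
    ("p", "black", False),
    ("#intro", "blue", False),
    (".note", "green", False),
    ("p", "red", False),
]
print(winning_value(color_declarations))
```

In this example the id selector wins even though a later rule also sets the property, because specificity is compared before source order.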
Property Parsing
The browser extracts the property name and value from each declaration and applies them to the matched elements in the DOM.
Computed Styles
The browser computes the final styles for each element by combining inherited styles, default styles, and the styles derived from the parsed CSS rules.
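A small sketch of how the three sources might be combined for one element, assuming the cascade has already produced a dict of winning values. DEFAULTS and INHERITED are tiny hypothetical tables; real browsers track many more properties and also resolve relative units and keywords at this stage.

```python
DEFAULTS = {"color": "black", "display": "inline", "margin": "0"}  # hypothetical defaults
INHERITED = {"color"}  # properties that inherit from the parent in this sketch


def computed_style(cascaded, parent_computed=None):
    """Combine cascaded values, inherited values, and defaults into one style dict."""
    style = {}
    for prop, default in DEFAULTS.items():
        if prop in cascaded:
            style[prop] = cascaded[prop]  # a parsed CSS rule set this property
        elif prop in INHERITED and parent_computed is not None:
            style[prop] = parent_computed[prop]  # take the parent's computed value
        else:
            style[prop] = default
    return style


body_style = computed_style({"color": "navy", "margin": "8px"})
p_style = computed_style({"display": "block"}, body_style)
print(p_style)
```

The paragraph picks up its color from the body through inheritance, its display value from its own rule, and its margin from the default, since margin does not inherit.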
HTML parsing determines a structured representation of the document. CSS parsing determines the final presentation of elements on the web page. Both HTML and CSS parsing processes are vital for rendering web content accurately and efficiently.