Skip to main content

Compress your HTML files

Updated: 2012-02-24


This is probably what every web developer knows. A very basic performance improvement for your HTML pages. You should remove all the white spaces and line breaks before you publish your html page. You can view the effect directly in a DOM inspector and this post touches the same.

Consider this example available from the HTML5 spec (pg 21):

<!DOCTYPE html>
<html>
  <head>
    <title>Sample page</title>
  </head>
  <body>
    <h1>Sample page</h1>
    <p>This is a <a href="demo.html">simple</a> sample.</p>
    <!-- this is a comment -->
  </body>
</html>


Save it in a .html file and open in your favorite browser. Next view the DOM created by this file. You can use the DOM inspector available in most of the browsers today (launch with CTRL + SHIFT + I). You will see the below DOM tree created:

DOM Tree with white spaces

So a lot of extra text nodes are introduced in place of your white spaces and line breaks. So now lets strip the html off all the white spaces, comments and line breaks. Your html will look like this:

<!DOCTYPE html><html><head><title>Sample page</title></head><body><h1>Sample page</h1><p>This is a <a href="demo.html">simple</a> sample.</p></body></html>

Reload the page and view the DOM structure using the DOM Inspector. You will see:

DOM Tree without white spaces

Notice the difference? The DOM tree created has lesser elements. Serving the page is also faster.

Develop readable code, with all the best practices proper indentation, line breaks etc. But do strip the html page before you publish, its much smaller and faster this way. Same applies for your CSS and JS files.

Comments

  1. But again there is a trad off .What if user wants html to be in human readable form ? and i could see in most of the pages they are well formatted then y so much significance ? y they cant take care of this in html5 atleast ? :P

    ReplyDelete
  2. krish, only if you want the "view source" to show readable stuff. I would be ok with this trade off :)

    ReplyDelete
  3. Thanks Pinaki. Well Krish, if you want, you can still view the proper tree using a DOM inspector. But I feel better compress all your web files (html/js/css)

    ReplyDelete

Post a Comment

Popular posts from this blog

Minimal required code in HTML5

I've encountered this question repeatedly of late. "What are the tags required at bare minimum for a html file?" Earlier there were a bunch of mandatory tags that were required for any html file. At bare minimum, the recommended structure was: (ref: http://www.w3.org/TR/html4/struct/global.html ) <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <HTML>   <HEAD>     <TITLE>A small HTML</TITLE>   </HEAD>   <BODY>     <P>Small HTML file!</P>   </BODY> </HTML> Yes, using capitals for the tags was the way to go! Those were the days of the purists and strict was the way to be. Now open your notepad and copy the above code, save the file as old.html and launch it in Chrome or Firefox. You will see only one line "Small HTML file!" shown. Now launch the developer tools in Chrome or Inspect Element in Firefox. Thi...

Using duplicate IDs in HTML

Well today I'm being a bit controversial. Let us see what the HTML5 spec says about unique IDs in a HTML file. The  id  attribute specifies its element's  unique identifier (ID) . The value must be unique amongst all the IDs in the element's  home subtree  and must contain at least one character. The value must not contain any  space characters . An element's  unique identifier  can be used for a variety of purposes, most notably as a way to link to specific parts of a document using fragment identifiers, as a way to target an element when scripting, and as a way to style a specific element from CSS. Yes its been mentioned almost everywhere on the planet that ID must be unique. Now let us look at the below code, Launch dup.css #p2 {   background-color: yellow;  } dup-id.html <!DOCTYPE html> <html>   <head>     <title>Duplicate ID Tester</title> <link rel="stylesh...

Fixing Date, Time and Zone on RHEL 6 command line

Had to fix all time related issues on a remote RHEL 6 server which runs without any windowing system. Plain ol' command line. Documenting steps here for future reference: Check to see if your date and timezone settings are accurate: # date # cat /etc/sysconfig/clock The server I accessed had wrong settings for both the commands. Here are the steps I used to correct: Find out your timezone from the folder /usr/share/zoneinfo # ls /usr/share/zoneinfo Mine was pointing to America/EDT instead of  Asia/Calcutta Update and save the /etc/sysconfig/clock file to # sudo vi /etc/sysconfig/clock ZONE="Asia/Calcutta" UTC=true ARC=false Remove the /etc/localtime # sudo rm /etc/localtime Create a new soft link to your time zone # cd /etc # sudo ln -s /usr/share/zoneinfo/Asia/Calcutta /etc/localtime # ls -al localtime Now it should show the link to your time zone Set your hardware clock to UTC # sudo hwclock --systohc --utc # hwclock --show Update your t...