2. The basic SGML document

The basic SGML document consists of a document type declaration (which names the DTD, or Document Type Definition), one top-level element chosen from several (elements are also known as tags or markup), paragraphs, and text. The top-level element should be <book>, <chapter>, <article>, or <sect1>, depending on the type of document you are writing. We will be using <article> for our documents. Here is an example of a simple SGML document:

<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN">

<article>
   <sect1 id="introduction"><title>Hello world introduction</title>

      <para>
      Hello world!
      </para>

   </sect1>
</article>

For the rest of the tutorial I will use element, tag and markup interchangeably and treat them as synonyms. The first line is the document type declaration; it names the DTD (Document Type Definition) that the document follows. Notice that the first and last tags are <article> and </article>. All other markup is "contained" by those two tags. Notice the <sect1> tag. It has an attribute called "id". Don't worry about attributes for now; just know that every <sectX> tag, where X is a number between 1 and 5, must have an "id" attribute if you want automatic hyperlinks created for HTML documents when you run the SGML parser on the file. Also, every <sectX> tag requires at least a paragraph (the <para> tags) and an ending </sectX> tag.
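
To make the <sectX> numbering concrete, here is a minimal sketch of a <sect1> that contains a <sect2>. The id value "introduction-details" and the subsection title are made up for illustration; any unique name would do.

```sgml
<sect1 id="introduction"><title>Hello world introduction</title>

   <para>
   Hello world!
   </para>

   <!-- A nested section: one level down, so the number goes up by one -->
   <sect2 id="introduction-details"><title>A nested subsection</title>

      <para>
      Text that belongs to the subsection.
      </para>

   </sect2>
</sect1>
```

Each section carries its own "id" and has its own closing tag, and the <sect2> closes before the <sect1> that contains it.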

Also notice that the document type declaration names 'article' as the document type, which is what allows you to use the <article> element as the top-level tag in the first place. If, for example, it said 'book' instead of 'article', the SGML parser would fail to render the document.
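
As an illustration of that mismatch, the sketch below is the same hello-world document with only the word after DOCTYPE changed. It is meant to fail; the exact error message depends on the parser you have installed, so none is shown here.

```sgml
<!-- Broken on purpose: the declaration says "book" but the top-level element is article -->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V3.1//EN">

<article>
   <sect1 id="introduction"><title>Hello world introduction</title>

      <para>
      Hello world!
      </para>

   </sect1>
</article>
```

Changing 'book' back to 'article' makes the declaration and the top-level element agree again.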

2.1. General structure of SGML

A good doc writer always makes sure to put a license in the SGML file. So here is what our new SGML file looks like:

<!--
Copyright (c)  2001  your name, NewbieDoc project;
http://sourceforge.net/projects/newbiedoc
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.1 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover
Texts, and with no Back-Cover Texts. A copy of the license can
be found at http://www.fsf.org/copyleft/fdl.html.
-->

<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN">

<article id="hello-world">
   <sect1 id="introduction"><title>Hello world introduction</title>

      <para>
      Hello world!
      </para>

   </sect1>
</article>

Notice how we commented out the license using <!-- and -->. This is important; if you forget the comment markers you will get all kinds of errors when you run the file through the SGML parser. The license will not be visible in the built document, because the parser treats it as a comment (and it is!) and drops it from the output. This is handy because we can add comments to our files to remind ourselves to do things later on. It does leave us with a problem: there is no visible license in the final document. That is fine for now; we will get to that later.
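
A reminder comment of that kind might look like the sketch below. The wording of the note is only an example; the parser ignores whatever sits between the comment markers.

```sgml
<sect1 id="introduction"><title>Hello world introduction</title>

   <!-- TODO: expand this greeting into a proper introduction -->
   <para>
   Hello world!
   </para>

</sect1>
```

One thing to watch for: do not put a double hyphen inside the comment text itself, because SGML uses that pair of characters to mark where a comment starts and ends.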

Also, notice that I added an "id" attribute to the article tag. The HTML file produced when the document is parsed will be named after this id. To really understand this, try it as an exercise. Either type or cut and paste the above code into the text editor of your choice. Save the file as my1st.sgml. Go to the command line and change to the directory that my1st.sgml is in. Execute this command:

      bash$ sgmltools -b html my1st.sgml
      

Now look in that directory. You should see a new directory called my1st. If you change into that directory and look at its contents you will see a file called hello-world.html. When you built this document the parser used the article's "id" attribute to create the name of the output file. The actual name of your SGML file is used to create the subdirectory name. This isn't that important now because we only have one HTML file; however, once you start adding multiple sections, you will see many more files in that subdirectory. If you fail to include an "id" attribute for <article>, the HTML file will be given an arbitrary name, something like "t1.html", which is no help at all.

Congratulations! You just built your first SGML file into another file format. Now open up your favorite web browser and look at your finished work.

Throughout the remaining sections I will show you snippets of DocBook SGML code which you can insert into this same document (my1st.sgml). You should do that and experiment until you get the hang of it. Try pasting several things into it and then build it with the parser. Refresh your web browser and notice the changes. Then add a few more, build, and refresh your browser again. Repeat as many times as necessary until you are comfortable. Then start adding content and create your own documents!
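
As a first experiment, you could paste a second section into my1st.sgml just before the closing </article> tag. This is only a sketch: the id "next-steps", the title, and the text are placeholders, not part of the original example.

```sgml
   <!-- Goes after the first </sect1> and before </article> -->
   <sect1 id="next-steps"><title>Next steps</title>

      <para>
      This section was added as an experiment.
      </para>

   </sect1>
```

Rebuild with the same sgmltools command and look in the my1st directory again. You will probably find an extra HTML file for the new section, though the exact file names depend on the stylesheets your installation uses.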