How to Read an Industrial Quick Search Site Statistics Page
The following table explains the terms in the statistics report
that are not self-explanatory:
Term |
Meaning |
User Sessions |
Similar to unique sites,
this is the number of unique hosts accessing the server within
a given time window. The time window is one day by default for
backward compatibility, but it can be changed with the -u option or
the Session directive in the configuration
file. For example, if the time window is two hours, all accesses
from a given host less than two hours after that host's first access
are lumped together into one session. Any access
more than two hours after the first one is
counted as a new session. This gives you an estimate
of how many sessions were started from different sites to access
your server. |
Hits |
A hit is any response from
the server to a request sent by a browser. This includes
every kind of response, not only text files or documents.
If, for example, an HTML page has two embedded images, the server
generates three hits when the page is requested: one hit for the
HTML page itself and two hits for the two inline images. |
Files |
If the user requests a document
and the server successfully sends back a file for this request,
this is logged as a Code 200 (OK) response. Each such
response is counted as a file. Again, "file" here
means any kind of file. |
KBytes saved by cache |
The amount of data saved by various
caching mechanisms, such as those in proxy servers or browsers. This
value is computed by multiplying the number of Code 304 (Not
Modified) responses for each file by the size of that
file. Note: Because http-analyze can
determine the size of a file only if the file has been requested
at least once in the same summary period, the values for KBytes
saved by cache and KBytes requested are only approximations
of the real values. |
KBytes transferred |
This is the amount of data sent
during the whole summary period, as reported by the server. Note
that some servers log the size of a document instead of the actual
number of bytes transferred. In most cases the two are the
same. However, if a user interrupts the transmission by pressing the browser's
stop button before the page has been received completely, some
servers (for example, all Netscape web servers) do not log the
amount of data actually transferred but the amount that would
have been transferred if the user had loaded
the page completely. |
|
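The session and cache-savings definitions in the table can be illustrated with a minimal Python sketch. It is not http-analyze's actual implementation: the function names `count_sessions` and `kbytes_saved_by_cache`, the tuple layout of the records, the treatment of an access exactly one window after the first one, and the division by 1024 are all assumptions made for illustration.

```python
from collections import defaultdict


def count_sessions(accesses, window_seconds=24 * 3600):
    """Count sessions in an iterable of (host, unix_timestamp) pairs.

    Accesses from a host that fall less than window_seconds after the
    first access of that host's current session belong to that session.
    Assumption: an access exactly window_seconds later starts a new one.
    """
    session_start = {}  # host -> timestamp of the first access of its current session
    sessions = 0
    for host, ts in sorted(accesses, key=lambda access: access[1]):
        start = session_start.get(host)
        if start is None or ts - start >= window_seconds:
            session_start[host] = ts
            sessions += 1
    return sessions


def kbytes_saved_by_cache(records):
    """Estimate cache savings from (url, status, bytes_sent) records.

    For each file, the number of 304 responses is multiplied by the file
    size. The size is only known if the file was delivered with a 200
    response in the same period, so files seen only as 304 are skipped.
    """
    size = {}                        # url -> size taken from a 200 response
    not_modified = defaultdict(int)  # url -> number of 304 responses
    for url, status, nbytes in records:
        if status == 200:
            size[url] = nbytes
        elif status == 304:
            not_modified[url] += 1
    saved = sum(count * size[url]
                for url, count in not_modified.items() if url in size)
    return saved / 1024  # assumption: 1 KByte = 1024 bytes


if __name__ == "__main__":
    two_hours = 2 * 3600
    accesses = [("a", 0), ("a", 3600), ("a", 7200), ("a", 7300), ("b", 100)]
    print(count_sessions(accesses, window_seconds=two_hours))
    records = [("/a.html", 200, 2048), ("/a.html", 304, 0),
               ("/a.html", 304, 0), ("/b.gif", 304, 0)]
    print(kbytes_saved_by_cache(records))
```

The skipped `/b.gif` entry shows why the reported value is only an approximation: a file that was never delivered in full during the period contributes nothing to the total.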
It is important to distinguish the top
referring URLs from the top referring sites. URLs are the specific
pages of a given site.
For instance, Yahoo! is a site. Within
Yahoo!, every page on which your company is found has its own
URL. If a user searches for your company by name, the page that
shows the result is one specific URL. If a user searches
for your company by product, the page that shows the result is
a different URL; and if a user searches
for a different product, the page that shows that result
is yet another URL. By focusing on URLs, you
can set aside the users who search by company name and
concentrate on the group that searches by product category.
This shows which URLs are bringing you traffic
and gives you a foundation for your advertising decisions.
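The difference between referring sites and referring URLs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summarize_referrers` and the example addresses are hypothetical, and the input is assumed to be a plain list of referrer strings in which `-` or an empty string means that no referrer was sent.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_referrers(referrers):
    """Return (visits per referring site, visits per referring URL)."""
    by_site = Counter()
    by_url = Counter()
    for ref in referrers:
        if not ref or ref == "-":
            continue  # no referrer was sent, so no site can be credited
        by_url[ref] += 1
        # The site is the host part of the URL; fall back to the raw
        # string if it cannot be parsed as a URL with a host.
        by_site[urlsplit(ref).hostname or ref] += 1
    return by_site, by_url


if __name__ == "__main__":
    # Hypothetical referrers: one company-name search, two product searches.
    referrers = [
        "http://search.example.com/results?q=Acme+Corp",
        "http://search.example.com/category/valves",
        "http://search.example.com/category/valves",
        "-",
    ]
    sites, urls = summarize_referrers(referrers)
    print(sites.most_common())
    print(urls.most_common())
```

All three referred visits come from the same site, but the per-URL counts separate the company-name search from the product-category page.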
It is also important to understand the "no
referrer" statistic (usually top ranked). These are users
(or visits) that in many cases already have a close working relationship
with your company. These statistics normally result from a user:
* Having your site bookmarked
* Having your site in their drop-down address bar
* Having set their browser home page to your site
* Having typed your website address into their address bar
* Having clicked a link in a document (usually an e-mail)
Since these users are usually closely
associated with your company, they should in many cases be discounted when evaluating
prospects coming from your product search URLs.
You will see many visits to your home
page and your company's other URLs that are generated by
users visiting multiple pages of your site. A large share of these
visits comes from users moving through your website,
since each page visited is counted. In addition, users who
arrive with "no referrer" or through another
URL, such as a job page, are counted again if they go on to the
home page, which is why your index page shows so many visits. It
is important to focus on the other referring URLs, which represent
new flows of traffic into your site.
|
The web server is a program running on a networked machine, waiting
for connections from the outside world and serving documents
in response to requests from browsers.
To communicate, the server and the browser use an asynchronous
communication method, the Hypertext Transfer Protocol (HTTP).
It works as follows:
1. The user starts the browser and types
in a URL.
2. The browser connects to the given host and requests the specified
document.
3. The web server handles the request and sends out
a response:
a. If the document exists, the web server
delivers it.
b. If it does not exist or if access is not permitted, the web
server sends back an error message instead.
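The steps above can be tried out with Python's standard library alone. The sketch below is an illustration, not a description of any particular production server: it starts a throwaway local server from `http.server` on a free port, requests one document that exists and one that does not, and returns the two status codes. The file names and the helper names `fetch_status` and `demo` are assumptions.

```python
import functools
import http.server
import tempfile
import threading
import urllib.error
import urllib.request
from pathlib import Path


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that does not print a log line per request."""

    def log_message(self, format, *args):
        pass


def fetch_status(opener, url):
    """Play the browser's part: send a request, return the status code."""
    try:
        with opener.open(url, timeout=5) as response:
            return response.status  # step 3a: the document was delivered
    except urllib.error.HTTPError as err:
        return err.code  # step 3b: the server sent an error message


def demo():
    # Ignore proxy settings so the request goes straight to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with tempfile.TemporaryDirectory() as root:
        Path(root, "index.html").write_text("<html><body>Hello</body></html>")
        handler = functools.partial(QuietHandler, directory=root)
        # Port 0 lets the operating system pick a free port.
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        base = f"http://127.0.0.1:{server.server_address[1]}"
        try:
            found = fetch_status(opener, base + "/index.html")
            missing = fetch_status(opener, base + "/no-such-page.html")
        finally:
            server.shutdown()
            server.server_close()
    return found, missing


if __name__ == "__main__":
    print(demo())
```

With this setup the existing page should come back with code 200 (OK) and the missing one with code 404 (Not Found), which are the two outcomes described in steps 3a and 3b.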
The document delivered in response to this
request may contain inline objects. Inline objects are simply
URLs pointing to other resources: a document, an image, an
applet, a video or audio stream, or any other addressable HTML object.
The browser then requests all inline objects of the current page from
the server, using steps 2 and 3 above, before it can display the content
of that page.
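To see how one page turns into several requests, the following sketch lists the URLs a browser would ask for after receiving an HTML document. It is a simplification under stated assumptions: only the tag and attribute pairs in `INLINE_ATTRS` are treated as inline objects, and the names `InlineObjectParser` and `requests_for_page` are hypothetical. Real browsers recognize many more kinds of embedded resources.

```python
from html.parser import HTMLParser


class InlineObjectParser(HTMLParser):
    """Collect the URLs of inline objects found in an HTML document."""

    # Assumption: only these tag/attribute pairs count as inline objects.
    INLINE_ATTRS = {"img": "src", "embed": "src", "audio": "src",
                    "video": "src", "object": "data"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.INLINE_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def requests_for_page(page_url, html):
    """Return the page URL followed by the URLs of its inline objects."""
    parser = InlineObjectParser()
    parser.feed(html)
    return [page_url] + parser.urls


if __name__ == "__main__":
    page = ('<html><body><img src="logo.gif"><img src="photo.jpg">'
            '<a href="other.html">link</a></body></html>')
    for url in requests_for_page("/index.html", page):
        print(url)
```

A page with two embedded images yields three requests: one for the page and one per image. The ordinary link to `other.html` is not requested until the user follows it.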
This communication method is called asynchronous
because the browser sends out many requests for inline objects
at once, over different communication channels, without waiting for
the server's response before sending the next request.
Since the browser's requests are often handled by different server
processes or by different threads of one server process, there is absolutely
no relationship between the logfile entries that result from the server's
responses to requests for a document and its inline objects.
For example, the order in which the server
logs the successful transmission of the document itself and of the inline
images it contains is not predictable; it depends on the type
of documents and objects, server speed, system and network load, and many
other parameters.
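The effect on the logfile can be imitated with a small simulation. The sketch below does not talk to a real server: each request is handled by its own thread, a random delay stands in for server speed and network load, and an entry is appended to a shared list when the simulated transmission finishes. The function name `serve_concurrently`, the delay range, and the paths are assumptions chosen for illustration.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def serve_concurrently(paths, seed=None):
    """Handle all paths in parallel and return them in completion order."""
    rng = random.Random(seed)
    # Simulated transfer times; real delays depend on load, size and network.
    delays = {path: rng.uniform(0.0, 0.05) for path in paths}
    log = []
    lock = threading.Lock()

    def handle(path):
        time.sleep(delays[path])  # pretend to transmit the object
        with lock:
            log.append(path)      # the "logfile entry" is written at the end

    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        list(pool.map(handle, paths))  # wait until every request is done
    return log


if __name__ == "__main__":
    requested = ["/index.html", "/logo.gif", "/photo.jpg"]
    for run in range(3):
        print(serve_concurrently(requested))
```

Every run logs the same three entries, but their order can differ from run to run, so a log analyzer cannot assume that the entry for a page precedes the entries for its images.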