2015-11-22

Installing and Configuring Apache Web Server

Installing


yum install httpd


Errors from starting or running httpd are logged to:

/var/log/httpd/error_log


SELinux can also cause problems, for example when content is served from a directory that lacks the expected file context.

Installing the documentation (Apache must be restarted before the docs are served at http://localhost/manual):
yum install httpd-manual

Basic Apache Configuration

Apache core modules
  • core
  • prefork
  • http_core
  • mod_so

Other modules can be loaded using the LoadModule directive from /usr/lib64/httpd/modules

Configuration File

Located at /etc/httpd/conf/httpd.conf

Listen

Apache listens on the port given by the Listen directive

Listen 80

The above tells Apache to listen on port 80 on all interfaces

The below tells Apache to listen on port 80 only on a specific IP address

Listen x.x.x.x:80

Document Root

Defines the top-level directory of the web site

DocumentRoot "/var/www/html"

The above sets the directory from which the default Apache web server serves files.

So for example we would put a page called "index.html" under /var/www/html/ and then access that page via http://localhost/index.html

Process Run Identities

Installing httpd creates an apache user and group, and the default configuration runs Apache as that user and group using the following directives

User apache
Group apache

Apache Modules

Modules are loaded for additional functionality by giving a module identifier followed by the path to its shared object, for example

LoadModule rewrite_module modules/mod_rewrite.so

Multi-process Httpd handling

The prefork model keeps spare child processes idle and ready to handle requests. Their numbers are controlled in the configuration file via

StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 256

These should be tuned to your hardware, in particular the available memory, and to the expected load
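A common rule of thumb for sizing MaxClients is to divide the memory you can spare for Apache by the average size of one httpd process. The sketch below only does that arithmetic; the function name and the sample numbers are hypothetical, and real tuning should be based on measurements from your own server.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, avg_process_mb):
    """Rough upper bound for MaxClients under the prefork model.

    total_ram_mb:   physical memory on the host
    reserved_mb:    memory kept back for the OS and other services (assumption)
    avg_process_mb: measured average size of one httpd child (assumption)
    """
    if avg_process_mb <= 0:
        raise ValueError("avg_process_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return less than one worker, even on a very small host.
    return max(1, int(usable_mb // avg_process_mb))


if __name__ == "__main__":
    # Hypothetical host: 4 GB RAM, 1 GB reserved, 12 MB per httpd process.
    print(estimate_max_clients(4096, 1024, 12))
```

ServerLimit caps MaxClients, so if the estimate is higher than ServerLimit both values need to be raised together.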

Containers

Containers scope directives to specific directories, URLs, or sites. They're a way of localizing your config so that a setting applies to one part of the server without changing the global configuration

Example

<Directory "/var/www/cgi-bin">
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>


Other containers, such as "Location" and "VirtualHost", hold configuration parameters in the same way.

DirectoryIndex

Specifies which document to serve when a request names only a directory and not a file. The listed names are tried in order

DirectoryIndex index.html index.html.var
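To make the lookup order concrete, here is a minimal Python sketch of the idea: try each name from the DirectoryIndex line in order and use the first file that exists. The function name resolve_index is hypothetical and this is only an illustration, not how Apache implements it.

```python
import os


def resolve_index(directory, index_names=("index.html", "index.html.var")):
    """Return the first existing index file in `directory`, or None.

    `index_names` mirrors the order given on the DirectoryIndex line.
    """
    for name in index_names:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    # No index file found; a real server would fall back to other behavior here.
    return None


if __name__ == "__main__":
    print(resolve_index("/var/www/html"))
```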



Configuring Virtual Hosting

Serving multiple sites at the same IP address, but with different host names
OR
Serving multiple sites on different IP addresses

Example: One instance of apache serving 3 different branches for the same company


Name-Based Virtual Hosting

All of the sites resolve to the same IP address. The sub-domain name in the request specifies which site is wanted

east.example.org
west.example.org

Both resolve to the same IP address. The only difference is that east and west have different virtual host configurations in Apache

Virtual host for "East" will specify a DocumentRoot /var/www/html/east
Virtual host for "West" will specify a DocumentRoot /var/www/html/west

Apache relies on the "Host" field in the HTTP request header to know which site to serve
For the east site, the Host value is generally east.example.org

Defining a Virtual Host


# Enables virtual hosting for requests arriving on port 80
NameVirtualHost *:80

# Requests that do not match any virtual hosts are served from the first one
<VirtualHost *:80>
ServerName east.example.org
DocumentRoot /var/www/html/east
</VirtualHost>
<VirtualHost *:80>
ServerName west.example.org
DocumentRoot /var/www/html/west
</VirtualHost>
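The matching rule can be sketched in a few lines of Python: compare the Host header against each ServerName in order, and fall back to the first virtual host when nothing matches. The function name select_document_root is hypothetical, and the sketch ignores ServerAlias and IPv6 literals in the Host header.

```python
def select_document_root(host_header, vhosts):
    """Pick a DocumentRoot for a request.

    vhosts: ordered list of (server_name, document_root) pairs,
            in the same order as the <VirtualHost> blocks.
    """
    # Drop an optional ":port" suffix and compare case-insensitively.
    name = (host_header or "").split(":")[0].strip().lower()
    for server_name, document_root in vhosts:
        if server_name.lower() == name:
            return document_root
    # No match: the first virtual host acts as the default.
    return vhosts[0][1]


VHOSTS = [
    ("east.example.org", "/var/www/html/east"),
    ("west.example.org", "/var/www/html/west"),
]

if __name__ == "__main__":
    print(select_document_root("west.example.org", VHOSTS))
    print(select_document_root("unknown.example.org", VHOSTS))
```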


You can also specify separate logs and parameters for each site:
<VirtualHost *:80>
ServerName east.example.org
DocumentRoot /var/www/html/east
ErrorLog /var/log/httpd/east/error_log
TransferLog /var/log/httpd/east/access_log
</VirtualHost>

IP-Based Virtual Hosting

Here each host name resolves to its own IP address, and requests are handled according to the address they arrive on.

Basically, Apache decides which content to serve based on the IP address on which the request was received.
east.example.com -> x.x.x.x -> /var/www/html/east
west.example.com -> x.x.x.y -> /var/www/html/west

Typically this is done by adding IP aliases to an interface or by using separate interfaces, each with its own IP.

Other Methods of Serving Sites

Run multiple instances of Apache on different ports, each with its own configuration
Use multiple VMs, each with its own IP and port space

Apache Access Control

Three scenarios:
  • Fully public
  • Member restricted
  • Employee restricted



Security is generally configured inside a "Directory" container, which controls access to the files under that directory. A set of Auth* directives (such as AuthType, AuthName, and AuthUserFile) defines how users authenticate, alongside the access directives shown below.

Access Control based on IP


<Directory /var/www/html/east/admin>
order deny,allow
deny from all
allow from 192.168.1.0/24
</Directory>


You can specify access via CIDR, IP address, full or partial domain name
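The way "order deny,allow" is evaluated can be sketched in Python: Deny rules are checked first, a matching Allow rule overrides a Deny, and a client that matches neither is allowed. The function name is_allowed is hypothetical, "deny from all" is written here as the network 0.0.0.0/0, and only IPv4 CIDR rules are handled.

```python
import ipaddress


def is_allowed(client_ip, deny_networks, allow_networks):
    """Evaluate access the way 'Order deny,allow' does (IPv4 CIDR rules only)."""
    ip = ipaddress.ip_address(client_ip)
    denied = any(ip in ipaddress.ip_network(net) for net in deny_networks)
    allowed = any(ip in ipaddress.ip_network(net) for net in allow_networks)
    if allowed:
        # With deny,allow a matching Allow rule overrides any Deny.
        return True
    # Clients matching no rule at all are allowed by default.
    return not denied


if __name__ == "__main__":
    deny = ["0.0.0.0/0"]          # deny from all
    allow = ["192.168.1.0/24"]    # allow from 192.168.1.0/24
    print(is_allowed("192.168.1.5", deny, allow))
    print(is_allowed("10.0.0.1", deny, allow))
```

With "order allow,deny" the precedence and the default are reversed, so the order line matters as much as the rules themselves.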

You can also use .htaccess files, which store access configuration inside the content directories to which it applies

For example:
You could use an .htaccess file placed in /var/www/html/east/admin/.htaccess in order to specify auth info for admins of east

If they exist at several levels their content "accumulates" or "inherits" as apache follows the path down to the file it's serving.
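As an illustration of that accumulation, the sketch below lists the .htaccess locations that would be consulted, in order, on the way down to a file. It assumes the walk starts at the document root, which depends on where AllowOverride is enabled; the function name htaccess_candidates is hypothetical.

```python
from pathlib import PurePosixPath


def htaccess_candidates(doc_root, file_path):
    """List .htaccess paths from doc_root down to the directory of file_path.

    Later entries apply on top of earlier ones.
    Raises ValueError if file_path is not under doc_root.
    """
    root = PurePosixPath(doc_root)
    target_dir = PurePosixPath(file_path).parent
    relative = target_dir.relative_to(root)

    current = root
    candidates = [current / ".htaccess"]
    for part in relative.parts:
        current = current / part
        candidates.append(current / ".htaccess")
    return [str(path) for path in candidates]


if __name__ == "__main__":
    for path in htaccess_candidates(
        "/var/www/html", "/var/www/html/east/admin/index.html"
    ):
        print(path)
```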


The main benefit of .htaccess files is that non-root users who control their own content can be responsible for their own permissions

.htaccess slows performance slightly, because Apache has to look for and read these files in each directory on every request

Apache Logging

Common Log Format is a widely used standard format for HTTP access logs


The following directive defines how Apache writes its access log and is typically the default

LogFormat "%h %l %u %t \"%r\" %>s %b" common



"common" is the nickname for the format
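To show what each field of the common format looks like in practice, here is a small Python sketch that parses one log line with a regular expression. The function name parse_clf is hypothetical, the sample line is made up for illustration, and the pattern does not handle escaped quotes inside the request.

```python
import re

# Fields follow "%h %l %u %t \"%r\" %>s %b" in order.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf(line):
    """Parse one Common Log Format line into a dict, or return None."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # %b is logged as "-" when no bytes were sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


if __name__ == "__main__":
    sample = (
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326'
    )
    print(parse_clf(sample))
```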

More logging parameters


Custom Formatting of Logs


LogFormat "%h %t ... %{Referer}i" myformat
CustomLog logs/referer_log myformat


The nickname myformat tells CustomLog which LogFormat to use

Log File Analysis

Useful for:
Which parts of the site are most popular
Which users are accessing sites

Number of visits
Number of different visitors
Duration of visit
Entry page
Most viewed pages
Exit page
Domain or country of the visitor
User's browser and OS
Busiest time of the day
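A couple of these figures can be pulled out of an access log with very little code. The Python sketch below counts requests, distinct client addresses, and the most requested pages from lines in the common format. The function name summarize and the sample lines are hypothetical, and counting client addresses only approximates the number of visitors.

```python
from collections import Counter


def summarize(lines, top=3):
    """Count requests, distinct client hosts, and the most requested paths."""
    pages = Counter()
    hosts = set()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request field, skip the line
        request = parts[1].split()
        if len(request) < 2:
            continue  # malformed request line
        hosts.add(line.split()[0])
        pages[request[1]] += 1
    return {
        "requests": sum(pages.values()),
        "visitors": len(hosts),
        "top_pages": pages.most_common(top),
    }


SAMPLE = [
    '192.168.1.10 - - [22/Nov/2015:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '192.168.1.11 - - [22/Nov/2015:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '192.168.1.10 - - [22/Nov/2015:10:01:00 +0000] "GET /about.html HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The dedicated tools listed below do the same kind of counting at scale and add sessions, referrers, and reports.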

Some Open Source analysis tools:
Analog
AWStats
Webalizer

Proprietary Tools:
Sawmill
Splunk

Error Logs



Apache Handlers

Handlers can expose internal server information as web pages
server-status is a common one; it reports the server's current activity


server-info
merges settings from all configuration files to show you the final config and modules loaded

Logging and Status Reporting

