The mod_rewrite module of the Apache webserver can be used to translate a requested URL to a different one. This is useful, for example, if a page has been moved to a new location but you still want the old URL to work. If the webserver has been configured to allow it, this facility is available to users on a per-directory basis. You create a file named .htaccess in the directory in which you want URL translations to apply and put some directives in the file that specify how the translation should be done.

As an exercise for myself in writing .htaccess files, I've implemented a simplified URL scheme for read-only access to Grex conferences, items, and responses. It works as follows:

   http://jremmers.org/grex/bbs                 - list of all conferences
   http://jremmers.org/grex/bbs/CONF            - index of conference CONF
   http://jremmers.org/grex/bbs/CONF/ITEM       - content of an item
   http://jremmers.org/grex/bbs/CONF/ITEM/SEL   - selected part of an item

Examples:

   http://jremmers.org/grex/bbs/kitchen         - index of kitchen cf.
   http://jremmers.org/grex/bbs/web/5           - item 5 of web cf.
   http://jremmers.org/grex/bbs/web/5/2         - resp 2 of item 5 of web cf.
   http://jremmers.org/grex/bbs/web/5/1-4       - resps 1-4 of that item

Note that even though the domain given in the URLs is my website, no Grex conference content is actually stored there.

Feel free to play around with this. I'll explain how I did it in a subsequent response.
11 responses total.
What does that do to web logs? Is the rewritten URL logged as referrer?
Hm, dunno. Anybody?
I did some reading up on and experimentation with the HTTP referer header. Typically it's sent by a browser when you follow a link; its value is the URL of the page on which the link occurs. It's a reverse link from the target of the original link back to the source.

If you're reading this in Backtalk, you can see the value of the referer header by clicking on this link: http://c2.com/cgi/test/ You'll get a display of the list of HTTP headers that your browser sent to the server at c2.com. Unless your browser is configured not to send referer headers, one of the headers will be HTTP_REFERER; its value is the URL of the Backtalk page on which the link occurs. On the other hand, if you go to a URL by simply typing it into your browser address window, your browser shouldn't send a referer header. You can try this out with c2.com too.

I think all that is completely independent of any rewriting that mod_rewrite does, though, since the referer header is sent by the browser before any rewriting on the server takes place.

(My starting point for the above was looking at the "HTTP referer" article in Wikipedia (http://en.wikipedia.org/wiki/HTTP_Referer). The article points out that the correct spelling is "referrer" and that whoever made up the HTTP header name misspelled it.)
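For anyone who would rather check the referer behavior locally than rely on c2.com, here is a minimal sketch in Python of a page that reports the header. It is only an illustration: the names show_referer and call_app and the example.org URL are made up for this sketch, and it has nothing to do with how the c2.com test page is actually implemented.

```python
from wsgiref.util import setup_testing_defaults


def show_referer(environ, start_response):
    """Tiny WSGI app that reports the Referer header the client sent, if any."""
    # WSGI exposes the request header "Referer" under the key HTTP_REFERER.
    referer = environ.get("HTTP_REFERER")
    if referer is None:
        body = "No referer header was sent.\n"
    else:
        body = "HTTP_REFERER = " + referer + "\n"
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]


def call_app(referer=None):
    """Call the app directly with a fake request, so no browser is needed."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    if referer is not None:
        environ["HTTP_REFERER"] = referer

    def start_response(status, headers):
        pass  # the status line and headers are not needed for this check

    return b"".join(show_referer(environ, start_response)).decode("utf-8")


if __name__ == "__main__":
    # Simulates following a link from a (hypothetical) page.
    print(call_app("http://example.org/page-with-link.html"), end="")
    # Simulates typing the URL directly: no referer header at all.
    print(call_app(), end="")
```

To try it with a real browser, the same app could be served with wsgiref.simple_server.make_server("", 8000, show_referer).serve_forever() and visited once by following a link and once by typing the address.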
Hm... In testing the link in resp:3, it seems that no referer header is sent, at least by my browser (Safari). However, I made a page http://grex.org/~remmers/referer.html that links to c2.com/cgi/test/; when I click on *that* link, Safari sends the expected referer header. Same results with Firefox. Not sure what's going on. If I click on the link in resp:3 from my usual Backtalk interface (pistachio), no referer is sent. However, it is from the vanilla interface.
John, I'm reading, even if it's mostly over my head. Thank you for musing out loud about this stuff.
You're welcome. Okay, a little more investigation seems to indicate that a referer is sent if you're using Backtalk in readonly mode and not if you're using it as an authenticated user. Maybe it's a security feature.
Getting back to the original topic, here are the details on how the URL rewriting is done. In the root web directory of my website, I created a directory called "grex" and a subdirectory of that called "bbs" (which you can see via the link http://jremmers.org/grex/). The only file in the bbs directory is a .htaccess file that specifies how anything following "grex/bbs/" is translated. (The line numbers are supplied for ease of reference and aren't actually part of the file. Also, for readability I've done some line wrapping; each number corresponds to one line of the file. You can see the actual .htaccess file at http://jremmers.org/htaccess-example.txt)

-----------------------------------------------------------------------
1. RewriteEngine on
2. RewriteRule ^/*$
     http://grex.org/cgi-bin/backtalk/vanilla/conflist
3. RewriteRule ^([^/]+)/*$
     http://grex.org/cgi-bin/backtalk/vanilla/browse?conf=$1
4. RewriteRule ^([^/]+)/+([0-9]+)/*$
     http://grex.org/cgi-bin/backtalk/vanilla/read?conf=$1&item=$2&rsel=all
5. RewriteRule ^([^/]+)/+([0-9]+)/([^/]+)/*$
     http://grex.org/cgi-bin/backtalk/vanilla/read?conf=$1&item=$2&rsel=$3
-----------------------------------------------------------------------

Line 1 tells Apache to pay attention to the rewriting rules. The next 4 lines are rewriting rules, each of the form

   RewriteRule PATTERN REPLACEMENT

The PATTERN is a "regular expression" that specifies the form of what is to be replaced. The REPLACEMENT is what anything matching the pattern gets replaced with. I won't attempt to explain regular expressions in general here (see Google or Wikipedia), but for example the regular expression "^([^/]+)/*$" matches any string of one or more non-slash characters, followed by 0 or more slashes. The parentheses around "[^/]+" tell the rewrite engine to store the string of non-slashes in a variable named "$1", which can then be referenced in the replacement.
For example, in the URL "http://jremmers.org/grex/bbs/coop", the string "coop" matches the string-of-non-slashes expression "[^/]+". Since the latter is parenthesized, it's stored as "$1" and then dumped into the corresponding replacement. The rewritten URL is thus http://grex.org/cgi-bin/backtalk/vanilla/browse?conf=coop which is a standard Backtalk URL for generating the index to a conference.
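The matching described above can be tried out without a webserver. The following is a rough sketch in Python that copies the four patterns from the .htaccess file and applies them with Python's re module. It is an emulation, not Apache itself: it assumes the path has already had "grex/bbs/" stripped off, it writes backreferences as \1 rather than $1 because that is Python's notation, and it simply stops at the first rule that matches. The names BACKTALK, RULES, and rewrite are invented for the sketch.

```python
import re

BACKTALK = "http://grex.org/cgi-bin/backtalk/vanilla/"

# (pattern, replacement) pairs copied from lines 2-5 of the .htaccess file.
# Apache writes backreferences as $1, $2; Python templates use \1, \2.
RULES = [
    (r"^/*$", BACKTALK + "conflist"),
    (r"^([^/]+)/*$", BACKTALK + r"browse?conf=\1"),
    (r"^([^/]+)/+([0-9]+)/*$", BACKTALK + r"read?conf=\1&item=\2&rsel=all"),
    (r"^([^/]+)/+([0-9]+)/([^/]+)/*$", BACKTALK + r"read?conf=\1&item=\2&rsel=\3"),
]


def rewrite(path):
    """Return the rewritten URL for a path relative to grex/bbs/, or None."""
    for pattern, replacement in RULES:
        match = re.match(pattern, path)
        if match:
            # Fill the captured groups into the replacement template.
            return match.expand(replacement)
    return None  # no rule applies; this sketch leaves such paths alone


if __name__ == "__main__":
    for path in ["", "coop", "web/5", "web/5/2", "web/5/1-4"]:
        print(repr(path), "->", rewrite(path))
```

Calling rewrite("coop") reproduces the "coop" example: the string of non-slashes is captured as group 1 and dropped into the browse URL.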
Late-breaking news: I created a .htaccess file in the above-mentioned "grex" directory that makes the browsing scheme a little more hierarchical; the URL "http://jremmers.org/grex" takes you to the Grex homepage. Exercise for the technically inclined: What .htaccess file would achieve this effect?
Very nice, John... wouldn't have thought of doing this... :)
What led me to do this is some recent reading about "well-designed URLs". This got me to thinking about what a simple, clean, human-friendly URL scheme for bbs items and responses might look like. For some good ideas on the issue of well-designed URLs, see Mike Schinkel's post http://www.mikeschinkel.com/blog/welldesignedurlsarebeautiful/ and the various references he gives.
Very good stuff. Thanks for the link to mikeschinkel.com. I will have to mess around with this. Isn't .htaccess strictly an Apache thing, or is it also supported by the Windows server platforms?
- Backtalk version 1.3.30 - Copyright 1996-2006, Jan Wolter and Steve Weiss