Coding: rewrite rule medo

 

 From:  CHYRON (DSMITHHFX)  
 To:  ALL
41883.1 
Code: 
RewriteEngine on
RewriteCond %{REQUEST_URI} !^.*/mobile/.*$ [PT]
RewriteCond %{REQUEST_URI} !^.*/index.html$ [PT]
RewriteCond %{REQUEST_URI} ^.*\.html$ [PT]
RewriteRule /([a-z]+)-*(.*)\.html$ /#$1/$2  [R=301,L,NE,QSA]
Works perfectly* except that for filenames without a hyphen it adds a trailing slash, which I don't want, and I can't figure out why it's doing that or how to stop it.

http://domain.com/directory/intro-something.html rewrites to http://domain.com/directory/#intro/something *good*

http://domain.com/directory/intro.html rewrites to http://domain.com/directory/#intro/ *bad*

* in limited testing

Edit: I think the rewrite rule needs to say /#$1/$2 if there's a hyphen and /#$1 if there ain't. Cause there's your slash.
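
A rough way to see where the slash comes from is to run the same pattern through Python's `re` module, which treats these particular constructs the same way as far as I know. This is only a sketch: the `expand` helper is made up for illustration, `$1`/`$2` are written as `\1`/`\2`, and the sample paths are cut down to the final segment because exactly which part of the URL Apache hands to the pattern depends on where the rule lives.

```python
import re

def expand(pattern, template, path):
    """Return the substitution for path, or None when the pattern misses."""
    match = re.search(pattern, path)
    return match.expand(template) if match else None

# Pattern and substitution copied from the first RewriteRule above.
ORIGINAL = (r"/([a-z]+)-*(.*)\.html$", r"/#\1/\2")

# With a hyphen, group 2 captures "something".
print(expand(*ORIGINAL, "/intro-something.html"))  # /#intro/something

# Without one, "-*" matches zero hyphens and "(.*)" matches the empty
# string, so the literal "/" in the substitution is all that is left.
print(expand(*ORIGINAL, "/intro.html"))            # /#intro/
```

Both `-*` and `(.*)` are allowed to match nothing, so the pattern still succeeds on a hyphen-free name and the substitution's fixed `/` ends up trailing.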

This cluttery shite seems to work [better]:
 
Code: 
RewriteEngine on
#hyphenated
RewriteCond %{REQUEST_URI} !^.*/mobile/.*$ [PT]
RewriteCond %{REQUEST_URI} !^.*/index.html$ [PT]
RewriteCond %{REQUEST_URI} ^.*\.html$ [PT]
RewriteCond %{REQUEST_URI} ^.*/.*-.*\.html$ [PT]
RewriteRule /([a-z]+)-(.*)\.html$ /#$1/$2  [R=301,L,NE,QSA]

#not
RewriteCond %{REQUEST_URI} !^.*/mobile/.*$ [PT]
RewriteCond %{REQUEST_URI} !^.*/index.html$ [PT]
RewriteCond %{REQUEST_URI} ^.*\.html$ [PT]
RewriteRule /([a-z]+)\.html$ /#$1  [R=301,L,NE,QSA]
“Human Resources Startup Zenefits Is Laying Off Almost Half Its Employees”

 From:  Peter (BOUGHTONP)  
 To:  CHYRON (DSMITHHFX)     
41883.2 In reply to 41883.1 
Stop using conditions you don't need.
RewriteRule ^.*/mobile/ - [L,PT]
RewriteRule ^.*/index.html$ - [L,PT]
RewriteRule /([a-z]+)-([^-]+)\.html$ /#$1/$2 [L,R=301,NE,QSA]
RewriteRule /([a-z]+)\.html$ /#$1 [L,R=301,NE,QSA]
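
For anyone wanting to poke at these four rules without a server, here is a minimal Python sketch of the "first matching rule wins" behaviour that the [L] flags give. It is an approximation, not how Apache is implemented: `RULES` and `first_match` are names invented here, `None` stands in for the `-` (do nothing) substitution, and the sample paths assume the pattern sees something like `directory/intro.html`.

```python
import re

# (pattern, template) pairs in the same order as the four rules above.
# A template of None stands in for the "-" substitution (leave URL alone).
RULES = [
    (r"^.*/mobile/", None),
    (r"^.*/index.html$", None),
    (r"/([a-z]+)-([^-]+)\.html$", r"/#\1/\2"),
    (r"/([a-z]+)\.html$", r"/#\1"),
]

def first_match(path):
    """Return the redirect target of the first matching rule, else None."""
    for pattern, template in RULES:
        match = re.search(pattern, path)
        if match:
            # Every rule carries [L], so processing stops at the first hit.
            return match.expand(template) if template else None
    return None

for sample in ("directory/intro-something.html", "directory/intro.html",
               "directory/index.html", "directory/mobile/intro.html"):
    print(sample, "->", first_match(sample))
```

One thing the sketch makes visible: because of `[^-]+`, a name with two hyphens such as `directory/a-b-c.html` matches neither redirect rule and is left alone, whereas the original `(.*)` would have captured `b-c`.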

 From:  CHYRON (DSMITHHFX)  
 To:  Peter (BOUGHTONP)     
41883.3 In reply to 41883.2 
Nice! I'll test it on Monday. Thanks

 From:  Peter (BOUGHTONP)  
 To:  CHYRON (DSMITHHFX)     
41883.4 In reply to 41883.3 
If I had been more awake when I wrote that, I would have pointed out that the RewriteRule pattern is checked before the associated RewriteCond conditions, so in addition to being simpler and having less duplicated code, it reduces the number of unnecessary checks too.

Possibly a clearer way of explaining is that RewriteCond is not like an IF statement, but rather it's additional filtering checked only if the RewriteRule pattern is a match (but before the replacement/rewriting occurs).
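
The ordering described above can be mimicked in a few lines of Python. This is a toy model under stated assumptions, not Apache's code: `PATTERN`, `CONDITIONS` and `trace` are hypothetical names, and the two lambdas are stand-ins for RewriteCond lines.

```python
import re

PATTERN = r"/([a-z]+)\.html$"
TEMPLATE = r"/#\1"

# Hypothetical stand-ins for RewriteCond lines: (label, test) pairs.
CONDITIONS = [
    ("not-mobile", lambda path: "/mobile/" not in path),
    ("not-index", lambda path: not path.endswith("/index.html")),
]

def trace(path):
    """Return (result, steps) where steps records what was evaluated, in order."""
    steps = ["pattern"]
    match = re.search(PATTERN, path)
    if match is None:
        return None, steps            # conditions are never looked at
    for label, test in CONDITIONS:
        steps.append(label)
        if not test(path):
            return None, steps        # a failed condition vetoes the rewrite
    return match.expand(TEMPLATE), steps

print(trace("directory/style.css"))
print(trace("directory/intro.html"))
print(trace("directory/index.html"))
```

For the stylesheet only the pattern is tried; for the two .html paths the conditions run afterwards, and the index one is vetoed before any rewriting happens.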


 From:  CHYRON (DSMITHHFX)  
 To:  Peter (BOUGHTONP)     
41883.5 In reply to 41883.4 
Worked like a charm!

This part is confusing to me:
Quote: 
RewriteCond is not like an IF statement, but rather it's additional filtering checked only if the RewriteRule pattern is a match
The Apache docs put it this way:
Quote: 
...RewriteCond directives can be used to restrict the types of requests that will be subject to the following RewriteRule. 


Now I *might* need RewriteConds, to filter out search engines from the rewrite. So far this ain't doin nuthin':
Code: 
RewriteCond %{HTTP_USER_AGENT} !^(google|yahoo|bing) [NC]
RewriteCond %{HTTP_REFERER} !^(google|yahoo|bing) [NC]



 

 From:  Peter (BOUGHTONP)  
 To:  CHYRON (DSMITHHFX)     
41883.6 In reply to 41883.5 
The Apache docs would be more accurate if they said "...subject to rewriting by the following RewriteRule".

It's disappointing that the docs don't mention it at all, since it's not obvious, but anyhow the easiest way to prove it is the source: apply_rewrite_rule in mod_rewrite.c

RewriteRule matching is preceded by this comment...

    /* Try to match the URI against the RewriteRule pattern
     * and exit immediately if it didn't apply.
     */

And after that we have this...

    /* Ok, we already know the pattern has matched, but we now
     * additionally have to check for all existing preconditions
     * (RewriteCond) which have to be also true. We do this at
     * this very late stage to avoid unnecessary checks which
     * would slow down the rewriting engine.
     */     

Curious to see performance given as the reason, since arguably simple string checks on headers could be cheaper than the convoluted regexes that can occur. Having the option to choose when the condition is applied would allow the best performance.


 From:  Peter (BOUGHTONP)  
 To:  CHYRON (DSMITHHFX)     
41883.7 In reply to 41883.5 
As for the search engine stuff, the caret (^) is anchoring your match to the start of the string, but the Googlebot useragent is "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" so remove the caret. Also you shouldn't need the parentheses - the ! is a prefix, so try just "RewriteCond %{HTTP_USER_AGENT} !google|yahoo|bing [NC]"

Is it not simpler to use robots.txt to block them?
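
To check the caret point without touching the server, the two patterns can be tried against the Googlebot string quoted above in Python. A small sketch: `is_search_engine` is a made-up helper, and `re.IGNORECASE` is used here as the rough equivalent of [NC].

```python
import re

GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

def is_search_engine(user_agent, pattern):
    """True when the pattern is found; re.IGNORECASE plays the role of [NC]."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None

# Anchored: the string starts with "Mozilla", so this never matches.
anchored = is_search_engine(GOOGLEBOT_UA, r"^(google|yahoo|bing)")

# Unanchored: "Googlebot" is found in the middle of the string.
unanchored = is_search_engine(GOOGLEBOT_UA, r"google|yahoo|bing")

print(anchored, unanchored)  # False True
```

Since the `!` prefix makes the condition pass when the pattern does not match, the anchored version passes for Googlebot just as it does for a browser, which would explain it appearing to do nothing. The same reasoning should apply to a referer, which would start with something like `https://www.` rather than `google`.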


 From:  CHYRON (DSMITHHFX)  
 To:  Peter (BOUGHTONP)     
41883.8 In reply to 41883.7 
The object is to allow search engines to crawl unrewritten *.html urls (except index, and those in mobile/), and to rewrite human-submitted urls (from search results) with *.html suffix to the hashed urls -- I've got it all working with javascript redirects, but I think intercepting it before anything gets served would be preferable. Suffice to say it's become an academic exercise as the client has decided they don't want the app to be searchable after all. Now I just want to see if I can get the htaccess method to work.

 From:  CHYRON (DSMITHHFX)  
 To:  ALL
41883.9 
So here's what ended up working in testing on two different Apache 2.2 servers.

OS X development server on a Power Mac G5 (Apache installed through MacPorts), localhost:8081 pointed at a virtual host:
Code: 
RewriteEngine On
RewriteBase /

RewriteCond %{HTTP_USER_AGENT} !google|yahoo|bing [NC]
RewriteCond %{HTTP_REFERER} !google|yahoo|bing [NC]
RewriteCond %{REQUEST_URI} !^.*/mobile/
RewriteCond %{REQUEST_FILENAME} !^/index.html$
RewriteRule ^([a-z]+)-(.+)\.html$ /#$1/$2 [NE,R=301,L]

RewriteCond %{REQUEST_URI} !^.*/#[a-z]+/[.*]$
RewriteRule ^([a-z]+)\.html$ /#$1 [NE,R=301,L]
Staging server on Ubuntu 14.04 ppc (Power Mac G4), hosted in an "seo2" subdirectory:
Code: 
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} !google|yahoo|bing [NC]
RewriteCond %{HTTP_REFERER} !google|yahoo|bing [NC]
RewriteCond %{REQUEST_URI} !^.*/mobile/.*$
RewriteCond %{REQUEST_URI} !^.*/index.html$
RewriteRule ([a-z]+)-(.+)\.html$ /seo2/#$1/$2 [NE,R,L]

RewriteCond %{HTTP_USER_AGENT} !google|yahoo|bing [NC]
RewriteCond %{HTTP_REFERER} !google|yahoo|bing [NC]
RewriteCond %{REQUEST_URI} !^.*/mobile/.*$
RewriteCond %{REQUEST_URI} !^.*/#[a-z]+/[.*]$
RewriteRule ([a-z]+)\.html$ /seo2/#$1 [NE,R,L]
I haven't found any good online htaccess documentation or tutorials (I relied a lot on Stack Overflow), so these evolved through a lot of trial and (mostly) error.

htaccess seemed pretty erratic and unreliable on the staging server with the subdirectory, often needing a browser cache clear or sometimes just a few hours' wait before changes showed up.
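
A side note on the `!^.*/#[a-z]+/[.*]$` conditions in both versions, offered as a sketch rather than a verdict: as far as I can tell they never exclude anything. Browsers keep the `#fragment` to themselves, so it should not appear in REQUEST_URI, and inside square brackets `.` and `*` are literal characters rather than "anything". The Python below illustrates both points; the URLs are just the example ones from this thread.

```python
import re
from urllib.parse import urlsplit

# The fragment is split off by the client and is not part of the request path.
parts = urlsplit("http://domain.com/seo2/#intro/something")
print(parts.path)      # /seo2/
print(parts.fragment)  # intro/something

# Inside square brackets, "." and "*" are literal characters, so "[.*]"
# matches exactly one "." or one "*".
CONDITION = re.compile(r"^.*/#[a-z]+/[.*]$")
print(bool(CONDITION.search("/seo2/#intro/something")))  # False
print(bool(CONDITION.search("/seo2/#intro/*")))          # True

# A path the server would actually receive contains no "#" at all.
print(bool(CONDITION.search("/seo2/intro.html")))        # False
```

If that reading is right, the negated condition is always true for real requests and could be dropped without changing behaviour, though that has not been tested on either server here.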
