SMF session ID (invisible to search engines??)
Hi everyone, I'd like to set up a support forum for a new section of my site, and I was thinking of using SMF because it integrates perfectly with WordPress.
I noticed that the URLs it generates all have the PHP session ID at the end. I installed the seo4smf mod and the session disappeared from the URLs. The problem is that I read on the official forum that they advise against that mod for security reasons. Do you know anything about this?
Then, still searching the forum, I read that the session is not seen by spiders (www.simplemachines.org/community/index.php?topic=264423.0).
**What do you think?** Do you have any experience with this?
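To make the URL question concrete, here is a minimal Python sketch of what "removing the session from the URL" amounts to. It is only an illustration, not how seo4smf or SMF itself does it: the helper name `strip_session_id` is hypothetical, and it assumes the session travels as a `PHPSESSID=...` query parameter separated by `&` or `;`.

```python
import re

# Assumption: the session ID appears as a PHPSESSID=... query parameter,
# preceded by ?, & or ; and optionally followed by one separator.
_SESSION_PARAM = re.compile(r"(?<=[?&;])PHPSESSID=[^&;#]*[&;]?", re.IGNORECASE)


def strip_session_id(url: str) -> str:
    """Return the URL without its PHPSESSID parameter (hypothetical helper)."""
    cleaned = _SESSION_PARAM.sub("", url)
    # Remove a dangling separator or an empty query string left behind.
    return re.sub(r"[?&;]+(#|$)", r"\1", cleaned)


if __name__ == "__main__":
    for example in (
        "index.php?PHPSESSID=abc123&topic=5.0",
        "index.php?topic=5.0;PHPSESSID=abc123",
        "index.php?topic=5.0",
    ):
        print(example, "->", strip_session_id(example))
```

All three example URLs collapse to the same `index.php?topic=5.0`, which is the form you would want a spider to see.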
I found another article that explains how to "optimize" SMF:
www.simplemachines.org/community/index.php?topic=251309.0
I also found a robots.txt file:
# YouPosted.com Smart Robots v3.05
# This is a smart robots.txt which logs the ip and user agent of every visitor.
# Due to the compatibility issues between different bots and whether they support
# wildcards (*), multiple user-agents and end-anchors ($), I am providing different
# blocks for some.
# Detected Spider/Bot: None
# Headers Sent:
# Content-Type: text/plain
# Expires: Mon, 13 Oct 2008 03:16:05 GMT (12 hour validity)

# My Sitemap - I don't provide it just for the fun of it
Sitemap: www.youposted.com/sitemap.xml

# Google - Most Important bot
# Unfortunately a robots.txt will only stop it crawling certain urls, and NOT adding any
# urls which it comes across into its index. So we're relying on a meta noindex tag.
User-agent: Googlebot
# Don't index mobile versions
Disallow: /index.php?;wap
Disallow: /index.php?;wap2
Disallow: /index.php?*;imode

# Yahoo - Too aggressive
# So limit it as much as possible.
User-agent: Slurp
# Disallow Everything
Disallow: /
# Now allow bits and then disallow bits
Allow: /sitemap.xml$
Allow: /robots.txt$
Allow: /index.php$
Allow: /index.php?topic=.0$
Allow: /index.php?topic=.0$
Allow: /index.php?topic=.5$
Allow: /index.php?board=.0$
Allow: /index.php?board=*.0$
Allow: /index.php?board=.*5$
# But don't allow these
Disallow: /index.php?.msg
Disallow: /index.php?topic=.msg0$
Disallow: /index.php?topic=.msg5$
Disallow: /index.php?.new
# Anything with a ; disallow
Disallow: /index.php?;
# Arcade Related
Allow: /index.php?action=arcade$
Allow: /index.php?action=stats$
Allow: /index.php?action=arcade;sa=play;game=

# Bad bot - Often ignores robots.txt - Waste of bandwidth
# Despite claiming on their website to be a search engine in development
# I'm suspicious as to whether they are a harvester pretending to be SE
User-agent: Twiceler
Disallow: /

User-agent: W3C-checklink
Disallow: /

# Stop following PHPSESSID's
User-agent: MJ12bot
Disallow: /index.php?PHPSESSID

# Catch all (remainder)
# Will be followed by any bots other than ones identified above
# Uses BASIC robots.txt directives without wildcards, end-anchors etc
# So Spiders should understand these (including MSNBOT)
User-agent: *
# Default SMF Folders
Disallow: /attachments/
Disallow: /Packages/
Disallow: /Smileys/
Disallow: /Sources/
Disallow: /Themes/
# Default SMF Actions
Disallow: /index.php?action=activate
Disallow: /index.php?action=admin
Disallow: /index.php?action=calendar
Disallow: /index.php?action=emailuser
Disallow: /index.php?action=findmember
Disallow: /index.php?action=help
Disallow: /index.php?action=helpadmin
Disallow: /index.php?action=login
Disallow: /index.php?action=logout
Disallow: /index.php?action=mlist
Disallow: /index.php?action=modifykarma
Disallow: /index.php?action=pm
Disallow: /index.php?action=post
Disallow: /index.php?action=printpage
Disallow: /index.php?action=profile
Disallow: /index.php?action=recent
Disallow: /index.php?action=register
Disallow: /index.php?action=reminder
Disallow: /index.php?action=search
Disallow: /index.php?action=theme
Disallow: /index.php?action=unread
Disallow: /index.php?action=unreadreplies
Disallow: /index.php?action=verificationcode
Disallow: /index.php?action=who
Disallow: /index.php?theme
# SMF Mod Related
Disallow: /archive.php
Disallow: /index.php?action=blog
Disallow: /index.php?action=viewblog
Disallow: /index.php?action=chess
Disallow: /index.php?action=comment
Disallow: /index.php?action=downloads
Disallow: /index.php?action=links
Disallow: /index.php?action=reporttm
Disallow: /index.php?action=recenttopics
Disallow: /index.php?action=mm
Disallow: /index.php?action=sitemap
Disallow: /index.php?action=staff
Disallow: /index.php?action=tags
Disallow: /index.php?action=thankyou
Disallow: /index.php?action=viewkarma
Disallow: /index.php?action=viewers
Disallow: /index.php?f=
Disallow: /index.php?filter
Disallow: /index.php?referredby
Disallow: /Games/
Disallow: /Downloads/
Disallow: /index.php?action=arcade;favorites
Disallow: /index.php?action=arcade;sa=highscore
Disallow: /index.php?action=arcade;sa=play;random
Disallow: /index.php?action=arcade;category
Disallow: /index.php?action=arcade;sort
Disallow: /index.php?action=arcade;stats
Disallow: /index.php?action=stats;expand
Disallow: /index.php?action=stats;collapse

Out of curiosity I ran the query site:www.youposted.com (the site this robots.txt file comes from), and I have to say it is well indexed.
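If you want to check what the catch-all block actually lets through, a rough sketch with Python's standard `urllib.robotparser` could look like this. It only covers the `User-agent: *` section, because that parser does plain prefix matching and does not interpret the `*` and `$` patterns used in the Googlebot and Slurp blocks. The rule subset, the `is_allowed` helper, and the `ExampleBot` agent name are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# A few lines taken from the catch-all block of the robots.txt above.
CATCH_ALL_RULES = """\
User-agent: *
Disallow: /Sources/
Disallow: /index.php?action=login
Disallow: /index.php?action=search
"""


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Check a path against the sample rules (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(CATCH_ALL_RULES.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in (
        "/index.php?topic=1.0",
        "/index.php?action=login",
        "/Sources/Load.php",
    ):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

With these rules a topic URL stays crawlable, while the login action and anything under `/Sources/` are blocked. Keep in mind this only says what a well-behaved bot would skip crawling, not what ends up in the index.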