• Active User

    SMF session ID (invisible to search engines??)

    Hi everyone, I'd like to set up a support forum for a new section of my site, and I was thinking of using SMF because it integrates perfectly with WordPress.
    I noticed that the URLs it generates all end with the PHP session ID (PHPSESSID). I installed the seo4smf mod and the session ID disappeared from the URLs; the problem is that I read on the official forum that they advise against that mod for security reasons.

    Do any of you know anything about this?

    Then, still searching the forum, I read that spiders do not see the session ID. (www . simplemachines.org/community/index.php?topic=264423.0)

    **What do you think?** Do you have any experience with this?
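
To make the question concrete, here is a minimal Python sketch of what "removing the session from the URL" amounts to. It is not how seo4smf works internally (that is not something described in this thread); the function name `strip_session_id` and the example.com URLs are hypothetical, and the only assumption about SMF is that it separates query parameters with `;` as well as `&`.

```python
import re

# Matches a PHPSESSID parameter plus the separator that follows it, if any.
# Assumption: parameters may be separated by ';' (SMF style) or '&'.
_SESSION_RE = re.compile(r"(?<!\w)PHPSESSID=[^&;#]*[&;]?")


def strip_session_id(url: str) -> str:
    """Return the URL with any PHPSESSID query parameter removed."""
    cleaned = _SESSION_RE.sub("", url)
    # Drop a dangling '?', '&' or ';' left at the end of the query string.
    return re.sub(r"[?&;]+(?=#|$)", "", cleaned)


if __name__ == "__main__":
    print(strip_session_id(
        "http://example.com/index.php?PHPSESSID=abc123&topic=264423.0"
    ))
```

A cleanup like this only changes how a link looks; whether the forum still tracks sessions safely without the ID in the URL is a separate matter, which is presumably what the security warning is about.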


  • Active User

    I found another article that explains how to "optimize" SMF:

    www . simplemachines.org/community/index.php?topic=251309.0

    I also found a robots.txt file:

    ###################################
    # YouPosted.com Smart Robots v3.05
    ###################################
    # This is a smart robots.txt which logs the ip and user agent of every visitor.
    # Due to the compatibility issues between different bots and whether they support
    # wildcards (*), multiple user-agents and end-anchors ($), I am providing different
    # blocks for some.
    #
    # Detected Spider/Bot: None
    # Headers Sent:
    # Content-Type: text/plain
    # Expires: Mon, 13 Oct 2008 03:16:05 GMT (12 hour validity)

    # My Sitemap - I don't provide it just for the fun of it
    Sitemap: www . youposted.com/sitemap.xml

    # Google - Most Important bot
    # Unfortunately a robots.txt will only stop it crawling certain urls, and NOT adding any
    # urls which it comes across into its index. So we're relying on a meta noindex tag.
    User-agent: Googlebot
    # Don't index mobile versions
    Disallow: /index.php?*;wap
    Disallow: /index.php?*;wap2
    Disallow: /index.php?*;imode

    # Yahoo - Too aggressive
    # So limit it as much as possible.
    User-agent: Slurp
    # Disallow Everything
    Disallow: /
    # Now allow bits and then disallow bits

    Allow: /sitemap.xml$
    Allow: /robots.txt$
    Allow: /index.php$
    Allow: /index.php?topic=*.0$
    Allow: /index.php?topic=*.*0$
    Allow: /index.php?topic=*.*5$
    Allow: /index.php?board=*.0$
    Allow: /index.php?board=*.*0$
    Allow: /index.php?board=*.*5$

    # But don't allow these
    Disallow: /index.php?*.msg
    Disallow: /index.php?topic=*.msg*0$
    Disallow: /index.php?topic=*.msg*5$
    Disallow: /index.php?*.new

    # Anything with a ; disallow
    Disallow: /index.php?*;*

    # Arcade Related

    Allow: /index.php?action=arcade$
    Allow: /index.php?action=stats$
    Allow: /index.php?action=arcade;sa=play;game=

    # Bad bot - Often ignores robots.txt - Waste of bandwidth
    # Despite claiming on their website to be a search engine in development
    # I'm suspicious as to whether they are a harvester pretending to be SE

    User-agent: Twiceler
    Disallow: /

    User-agent: W3C-checklink
    Disallow: /

    # Stop following PHPSESSID's

    User-agent: MJ12bot
    Disallow: /index.php?PHPSESSID

    # Catch all (remainder)
    # Will be followed by any bots other than ones identified above
    # Uses BASIC robots.txt directives without wildcards, end-anchors etc
    # So Spiders should understand these (including MSNBOT)

    User-agent: *
    # Default SMF Folders
    Disallow: /attachments/
    Disallow: /Packages/
    Disallow: /Smileys/
    Disallow: /Sources/
    Disallow: /Themes/

    # Default SMF Actions

    Disallow: /index.php?action=activate
    Disallow: /index.php?action=admin
    Disallow: /index.php?action=calendar
    Disallow: /index.php?action=emailuser
    Disallow: /index.php?action=findmember
    Disallow: /index.php?action=help
    Disallow: /index.php?action=helpadmin
    Disallow: /index.php?action=login
    Disallow: /index.php?action=logout
    Disallow: /index.php?action=mlist
    Disallow: /index.php?action=modifykarma
    Disallow: /index.php?action=pm
    Disallow: /index.php?action=post
    Disallow: /index.php?action=printpage
    Disallow: /index.php?action=profile
    Disallow: /index.php?action=recent
    Disallow: /index.php?action=register
    Disallow: /index.php?action=reminder
    Disallow: /index.php?action=search
    Disallow: /index.php?action=theme
    Disallow: /index.php?action=unread
    Disallow: /index.php?action=unreadreplies
    Disallow: /index.php?action=verificationcode
    Disallow: /index.php?action=who
    Disallow: /index.php?theme

    # SMF Mod Related

    Disallow: /archive.php
    Disallow: /index.php?action=blog
    Disallow: /index.php?action=viewblog
    Disallow: /index.php?action=chess
    Disallow: /index.php?action=comment
    Disallow: /index.php?action=downloads
    Disallow: /index.php?action=links
    Disallow: /index.php?action=reporttm
    Disallow: /index.php?action=recenttopics
    Disallow: /index.php?action=mm
    Disallow: /index.php?action=sitemap
    Disallow: /index.php?action=staff
    Disallow: /index.php?action=tags
    Disallow: /index.php?action=thankyou
    Disallow: /index.php?action=viewkarma
    Disallow: /index.php?action=viewers
    Disallow: /index.php?f=
    Disallow: /index.php?filter
    Disallow: /index.php?referredby
    Disallow: /Games/
    Disallow: /Downloads/
    Disallow: /index.php?action=arcade;favorites
    Disallow: /index.php?action=arcade;sa=highscore
    Disallow: /index.php?action=arcade;sa=play;random
    Disallow: /index.php?action=arcade;category
    Disallow: /index.php?action=arcade;sort
    Disallow: /index.php?action=arcade;stats
    Disallow: /index.php?action=stats;expand
    Disallow: /index.php?action=stats;collapse

    Out of curiosity I ran a site:www . youposted.com query (for the site that uses this robots.txt file), and I have to say it is well indexed. 🙂
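
Since the file relies on wildcards (*) and end-anchors ($) for some bots, here is a rough Python sketch of how such a rule can be checked against a URL path. It assumes the common convention that `*` matches any run of characters, a trailing `$` anchors the end of the URL, and a rule without `$` is a prefix match. The helper name `rule_matches` is hypothetical, and individual crawlers may interpret these rules differently, which is the compatibility issue the file's own comments mention.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path (path + query).

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and without '$' the rule is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        return re.fullmatch(regex, path) is not None
    return re.match(regex, path) is not None


if __name__ == "__main__":
    # Rules taken from the file above; the URL paths are made-up examples.
    print(rule_matches("/index.php?board=*.0$", "/index.php?board=3.0"))
    print(rule_matches("/index.php?*;imode", "/index.php?topic=12.0;imode"))
    print(rule_matches("/index.php?PHPSESSID", "/index.php?PHPSESSID=abc&topic=12.0"))
```

This only tests a single rule. Deciding whether a URL is actually crawlable also depends on how a given bot resolves conflicts between Allow and Disallow lines, which this sketch does not attempt.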