[wp-hackers] wp_remote_request not telling me the 301'd URL
Edward de Leau
e at leau.net
Sat Mar 12 04:07:14 UTC 2011
thanks!
On Sat, Mar 12, 2011 at 4:50 AM, Jacob Santos <wordpress at santosj.name> wrote:
> 1. Check content-type, if exists. If it is "text/html" then run the filter
> to get the favicon.ico.
>
> 2. Oh my god, who would have thought a use case like this would have come
> up?
>
> 3. You need to look for a "Refresh" header as well. Some web servers (IIS)
> send Refresh instead of Location, as do web sites that show a redirect
> message for systems that do not support redirects.
>
> Jacob Santos
>
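A minimal sketch of the header checks in points 1 and 3 above, in plain PHP. It assumes the response headers are already available as an array keyed by lowercase header name (roughly the shape the WordPress HTTP API hands back). The function names find_redirect_target() and is_html_response() are hypothetical; they are not part of WordPress or of the wp-favicons plugin.

```php
<?php
/**
 * Return the redirect target named by a Location or Refresh header,
 * or null when neither header is present. Hypothetical helper.
 *
 * @param array $headers Response headers keyed by lowercase name.
 */
function find_redirect_target( array $headers ) {
    if ( ! empty( $headers['location'] ) ) {
        return trim( $headers['location'] );
    }
    // A Refresh header looks like "0; url=http://example.com/".
    if ( ! empty( $headers['refresh'] )
        && preg_match( '/^\s*\d+\s*[;,]\s*url\s*=\s*(.+)$/i', $headers['refresh'], $m ) ) {
        // Strip surrounding whitespace and optional quotes around the URL.
        return trim( $m[1], " \t\"'" );
    }
    return null;
}

/**
 * True when a Content-Type header exists and names text/html.
 * Hypothetical helper.
 */
function is_html_response( array $headers ) {
    if ( empty( $headers['content-type'] ) ) {
        return false;
    }
    // Drop parameters such as "; charset=UTF-8" before comparing.
    $parts = explode( ';', $headers['content-type'] );
    return 'text/html' === strtolower( trim( $parts[0] ) );
}
```

Only when is_html_response() returns true would the page be parsed for a favicon link; a binary response can be treated as the icon itself.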
> On Fri, Mar 11, 2011 at 2:09 PM, Edward de Leau <e at leau.net> wrote:
>
> > I have implemented manual redirection for the wp-favicons plugin here:
> >
> > http://plugins.trac.wordpress.org/browser/wp-favicons/trunk/includes/class-http.php
> >
> > (part of the next version, 0.5.1, where a mouseover on a redirect/tiny URL
> > shows the URL it redirects to)
> > I redirect 5 times max.
> >
> > e.g. (1) nu.nl (301) --> (2) www.nu.nl (200) --> (3)
> > www.nu.nl/images/favicon.ico (200)
> >
> > I needed the manual redirection because I need a base_href when none is
> > given in the HTML source. In that case I use the redirected URI as the
> > base_href.
> >
> > The code is not completely done, since a use case like:
> >
> > e.g. (1) newscred.com (301) --> (2) http://platform.newscred.com (200)
> > (look in page) --> (3) http://newscred.com/favicon.ico (301) (wtf? redirect
> > of content in page) -->
> > (4) http://newspapers.newscred.com/favicon.ico (200) --> (5)
> > http://newspapers.newscred.com//media/img/favicon.ico (200)
> >
> > does not work yet: this site gives (4) as the redirect URL, while (4) is
> > actually a page. So I need to add another check for binary content at the
> > beginning.
> >
> > But for all non-favicon self-redirection this should work.
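The manual redirection described above (at most 5 hops, keeping the final URL for use as base_href) could be sketched roughly as follows. This is not the code in class-http.php. The function name follow_redirects() and the shape of the $fetch callback are assumptions made for illustration, and the sketch assumes every Location value is already an absolute URL.

```php
<?php
/**
 * Follow redirects by hand and record every URL in the chain.
 *
 * @param string   $url      Starting URL.
 * @param callable $fetch    Hypothetical callback: takes a URL and returns
 *                           array( 'code' => int, 'location' => string|null ).
 * @param int      $max_hops Maximum number of redirects to follow.
 * @return array URLs in order, ending with the last target reached.
 */
function follow_redirects( $url, $fetch, $max_hops = 5 ) {
    $chain = array( $url );
    for ( $hop = 0; $hop < $max_hops; $hop++ ) {
        $response    = call_user_func( $fetch, $url );
        $is_redirect = $response['code'] >= 300 && $response['code'] < 400;
        if ( ! $is_redirect || empty( $response['location'] ) ) {
            break; // Final destination, or a redirect without a target.
        }
        $url = $response['location'];
        if ( in_array( $url, $chain, true ) ) {
            break; // Redirect loop: stop rather than revisit a URL.
        }
        $chain[] = $url;
    }
    return $chain;
}

// Canned responses standing in for real HTTP requests.
$fake = function ( $url ) {
    $map = array(
        'http://nu.nl/' => array( 'code' => 301, 'location' => 'http://www.nu.nl/' ),
    );
    return isset( $map[ $url ] ) ? $map[ $url ] : array( 'code' => 200, 'location' => null );
};

$chain = follow_redirects( 'http://nu.nl/', $fake );
// end( $chain ) is the URL to fall back on as base_href.
```

Inside WordPress the callback would presumably wrap wp_remote_head() called with array( 'redirection' => 0 ) and read the result with wp_remote_retrieve_response_code() and wp_remote_retrieve_header(). That wiring is not shown here, and the result should be checked with is_wp_error() first.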
> >
> > On Tue, Mar 1, 2011 at 12:31 AM, Scott Kingsley Clark <scott at skcdev.com> wrote:
> >
> > > The spidering process can really take a lot of time for a large site, and
> > > can end up eating resources and eating into the infamous PHP
> > > max_execution_time, so I was looking to cut corners. If I've gotta do two
> > > requests to do this, I'll do it. Thanks for the advice and attention.
> > >
> > > -Scott
> > >
> > > On Monday, February 28, 2011 5:28:54 PM UTC-6, Jacob Santos wrote:
> > > >
> > > > Not really. wp_remote_request() simply defaults to GET; you can change
> > > > it to HEAD, which seems to be what you want anyway. You can check
> > > > whether the response is a redirect and then send another request. It
> > > > does not sound like speed is a concern (although it is one factor, since
> > > > many sites can quite frankly get up there in the number of redirects,
> > > > given what canonical URLs might give you). Hint: it should be at most
> > > > 2 requests, one for the redirect and one for the actual page.
> > > >
> > > > You'll probably want to use wp_remote_head() instead, since
> > > > wp_remote_request() is a generic function made to accommodate the rest
> > > > of HTTP and the HTTP extensions (there isn't any built-in call support
> > > > for Subversion or WebDAV).
> > > >
> > > > Jacob Santos
> > > >
> > > > On Mon, Feb 28, 2011 at 5:22 PM, Scott Kingsley Clark <sc... at skcdev.com> wrote:
> > > >
> > > > > Actually, this is in regards to a plugin I'm currently developing. It's
> > > > > in Beta right now but it's available on WP.org. It's called Search
> > > > > Engine and it's like a mini-Google on your site. It spiders your site
> > > > > (or other sites too) and indexes content into the DB.
> > > > >
> > > > > http://wordpress.org/extend/plugins/search-engine/
> > > > >
> > > > > The use-case is that I want to be able to tell whether a page that's
> > > > > linked to on a site is really redirected elsewhere. Right now, since I
> > > > > switched to wp_remote_request, I only get the content of the final
> > > > > destination page, without any knowledge of the path it's taken. So when
> > > > > you get content using wp_remote_request and it's redirected, the best
> > > > > my script (or any script) can tell is that the page exists at the URL
> > > > > requested -- oblivious to the real redirect happening. (Previously I
> > > > > was using a home-brewed version similar to wp_remote_request but
> > > > > calling cURL and others manually.)
> > > > >
> > > > > So it looks like right now I'll need to write a little extra code to
> > > > > make my own wp_remote_request-like function which does both the 301/302
> > > > > redirect header check and the body content return.
> > > > >
> > > > > -Scott
> > > > >
> > > > > On Monday, February 28, 2011 5:11:22 PM UTC-6, Dion Hulse (dd32) wrote:
> > > > > >
> > > > > > 2 separate requests will be 2 separate requests.
> > > > > > What's the use-case you're working on here?
> > > > > > Personally, I'd do a normal fetch, followed by a HEAD if it was an
> > > > > > exceeded-redirects error, if you want the body; otherwise, the URL.
> > > > > > But I can't think of a case where you'd want one or the other.
> > > > > >
> > > > > > On 1 March 2011 04:06, Scott Kingsley Clark <sc... at skcdev.com> wrote:
> > > > > >
> > > > > > > Not sure if anyone knows this, but does the page get loaded twice, or
> > > > > > > is the second load served from some sort of cache? I'm specifically
> > > > > > > referring to the idea of using wp_remote_head on a URL to check for a
> > > > > > > redirect, and then using wp_remote_request on the same URL to get the
> > > > > > > content, etc.
> > > > > > > _______________________________________________
> > > > > > > wp-hackers mailing list
> > > > > > > wp-h... at lists.automattic.com
> > > > > > > http://lists.automattic.com/mailman/listinfo/wp-hackers