Is possible to retrieve queryhash string for an Instagram post? #10

Closed
opened 2 years ago by youngvo · 16 comments

Hi Kali,

Thanks for writing this library.

Currently, the library is able to retrieve the query hash string used to fetch a hashtag's posts. I'm wondering whether the same mechanism could be applied to retrieve the query hash string used to fetch the information of a single post. Please advise.

Best,
-Young

youngvo changed title from Is possible to retrieve queryhash for a Instagram post? to Is possible to retrieve queryhash string for an Instagram post? 2 years ago
KaKi87 commented 2 years ago
Owner

Hello Young,

In which case do you need a query hash for a post ?

As far as I know, you only need a shortcode for a post.

The shortcode is provided by all methods that return an array of posts.

KaKi87 added the question label 2 years ago
Poster

Hi Kali,

Actually, I would like to get the post without authentication by calling the GraphQL query hash endpoint directly, like this: https://www.instagram.com/graphql/query/?query_hash=cf28bf5eb45d62d4dc8e77cdb99d750d&variables={"shortcode":"CNBg--UguoF","child_comment_count":3,"fetch_comment_count":40,"parent_comment_count":24,"has_threaded_comments":true}

Is there any way to achieve this?

Best,
-Young
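
For reference, the request above could be assembled programmatically along these lines. This is only a minimal sketch: `buildPostQueryUrl` is a hypothetical helper, not part of the library, and the query hash shown is the one from the URL above, which Instagram may change at any time.

```js
// Hypothetical helper (not part of the library): builds the GraphQL URL
// for a post from a query hash and a shortcode.
function buildPostQueryUrl(queryHash, shortcode, extraVariables = {}) {
    const url = new URL('https://www.instagram.com/graphql/query/');
    url.searchParams.set('query_hash', queryHash);
    // The variables object is sent as JSON; URLSearchParams percent-encodes
    // the quotes and braces so the query string stays valid.
    url.searchParams.set(
        'variables',
        JSON.stringify({ shortcode, ...extraVariables })
    );
    return url.toString();
}

// Usage with the values quoted in this thread:
const exampleUrl = buildPostQueryUrl(
    'cf28bf5eb45d62d4dc8e77cdb99d750d',
    'CNBg--UguoF',
    { fetch_comment_count: 40 }
);
```

The resulting string can be passed to any HTTP client; whether Instagram answers it anonymously depends on its rate limits at that moment.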

KaKi87 commented 2 years ago
Owner

You already can get this post without authentication using the following code :

```js
const
    Instagram = require('./index.js'),
    client = new Instagram();
client.getPost('CNBg--UguoF')
    .then(data => console.log(JSON.stringify(data, null, 4)));
```

Which returns :

```json
{
    "shortcode": "CNBg--UguoF",
    "author": {
        "id": "45246444070",
        "username": "gasshobar",
        "name": "",
        "pic": "https://scontent-cdg2-1.cdninstagram.com/v/t51.2885-19/s150x150/160252865_266456814952883_1876981082531618425_n.jpg?tp=1&_nc_ht=scontent-cdg2-1.cdninstagram.com&_nc_ohc=Dukk7cf8IKcAX_zuzZ0&ccb=7-4&oh=369aa27030e00099afdf6a822a40bc75&oe=608DBDF3&_nc_sid=83d603",
        "verified": false,
        "link": "https://www.instagram.com//gasshobar"
    },
    "location": {
        "id": "212928653",
        "name": "Miami Beach, Florida",
        "city": "Miami Beach, Florida"
    },
    "contents": [
        {
            "type": "photo",
            "url": "https://scontent-cdg2-1.cdninstagram.com/v/t51.2885-15/e35/166401645_773716946900784_2846059157129192918_n.jpg?tp=1&_nc_ht=scontent-cdg2-1.cdninstagram.com&_nc_cat=100&_nc_ohc=eAvM7scGef8AX9IWLII&ccb=7-4&oh=1a548fecc760118498654102b7a97247&oe=608D48B7&_nc_sid=83d603"
        }
    ],
    "tagged": [],
    "likes": 5,
    "caption": "delicious sushi boat🍣🍥\n.\n.\nThese services are exclusive only for maritime places🛥🛥\n.\n#miamilife #miamiboatlife #miamiboatrental #miamiboatparty  #boat #drinks #miamibeachflorida #yatchlife #yachtparty #miamibeachmarina #friends #miami #rentadebotesmiami #familiayamigos #family #boatcharters #gassho #brunch #dj #margaritas #vodka #whiskey #tequila #piñacolada #boat #mojito #florida #mar#miamibeach #bar",
    "hashtags": [
        "#miamilife",
        "#miamiboatlife",
        "#miamiboatrental",
        "#miamiboatparty",
        "#boat",
        "#drinks",
        "#miamibeachflorida",
        "#yatchlife",
        "#yachtparty",
        "#miamibeachmarina",
        "#friends",
        "#miami",
        "#rentadebotesmiami",
        "#familiayamigos",
        "#family",
        "#boatcharters",
        "#gassho",
        "#brunch",
        "#dj",
        "#margaritas",
        "#vodka",
        "#whiskey",
        "#tequila",
        "#pi",
        "#boat",
        "#mojito",
        "#florida",
        "#mar",
        "#bar"
    ],
    "mentions": null,
    "edited": true,
    "comments": [],
    "commentCount": 0,
    "timestamp": 1617066063,
    "link": "https://www.instagram.com/p/CNBg--UguoF"
}
```
Poster

Thanks for your kind reply. I tried the code snippet you suggested above. It always fails with 404, and no response is returned.

KaKi87 commented 2 years ago
Owner

Since there is no hard-coded 404 response, this code must be coming from Instagram directly.

But, I am able to see this post myself, which is weird.

The only case I can think of where one of two people gets a 200 response while the other gets a 404 is when the latter has no access to the resource and the server hides its existence (404) instead of denying access (403).

But since neither of us is using authentication, this situation should be impossible.

Can you try from another IP address?

Poster

Today I tried again. It threw the error code 429. Does it come from this code snippet?

```js
case 302: {
    switch(res.headers.location){
        case insta + 'accounts/login/':
            return reject(429);
        case insta + 'accounts/login/?next=/accounts/edit/%3F__a%3D1':
            return reject(401);
        default: {
            if(res.headers.location.startsWith(insta + 'challenge/?next='))
                return reject(409);
            reject(res.statusCode);
        }
    }
    break;
}
```
KaKi87 commented 2 years ago
Owner

Yes: when a rate limit is hit, Instagram doesn't explicitly return an HTTP `429 Too Many Requests` response, but a `302 Found` (temporary redirect) pointing to the login page. The library translates that redirect into a 429 rejection.

Since anonymous and authenticated requests have different rate limits, you may be able to bypass this limitation immediately by authenticating now.

However, you may hit another rate limit later.
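
To make the mapping explicit, here is a standalone sketch of the same redirect classification as the snippet quoted above. The function name `classifyRedirect` is made up for illustration, and the value of `insta` is an assumption, since the thread only shows the variable name.

```js
// Assumed base URL; the library's actual `insta` constant is not shown here.
const insta = 'https://www.instagram.com/';

// Hypothetical helper: maps the Location header of a 302 response to the
// status code that the library's snippet above would reject with.
function classifyRedirect(location) {
    if (location === insta + 'accounts/login/')
        return 429; // redirected to login: treated as a rate limit
    if (location === insta + 'accounts/login/?next=/accounts/edit/%3F__a%3D1')
        return 401; // redirected to login from an authenticated-only page
    if (typeof location === 'string' && location.startsWith(insta + 'challenge/?next='))
        return 409; // redirected to a challenge (captcha) page
    return 302; // any other redirect: the original status code is kept
}
```

Callers can then branch on the numeric code rather than on the redirect target.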

Poster

Thanks, Kaki, for your response.

Actually, I see that your library code calls the endpoint https://www.instagram.com/p/CNBbcLJgqXG/?__a=1 to retrieve the metadata of a post. However, this endpoint requires authentication, so Instagram redirects you to the login page.

Without authentication, you can call this endpoint instead to retrieve the metadata of the post: https://www.instagram.com/graphql/query/?query_hash=cf28bf5eb45d62d4dc8e77cdb99d750d&variables={"shortcode":"CNBbcLJgqXG"}. Can you add this endpoint to your library?

KaKi87 commented 2 years ago
Owner

> Actually, I'm seeing in your library code, it invokes the endpoint https://www.instagram.com/p/CNBbcLJgqXG/?__a=1

No, that endpoint is only called when you're authenticated, see:

```js
__a: sessionId ? '1' : undefined
```

When not authenticated, the full browser page is scraped, see:

```js
Object.values(Object.values(JSON.parse(body.match(/_sharedData = (.+);/)[1])['entry_data'])[0][0]['graphql'])[0]);
```

> Without the authentification, you can make a call to this endpoint https://www.instagram.com/graphql/query/?query_hash=cf28bf5eb45d62d4dc8e77cdb99d750d&variables={"shortcode":"CNBbcLJgqXG"}

I see that the endpoint does work, but in order to scrape the query hash for any client, can you please tell me on which page, and through which action, you triggered this GraphQL query?

Thanks.
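
For readers unfamiliar with the scraping path, the one-liner above can be unpacked roughly as follows. This is a sketch only: `extractPostFromHtml` is a hypothetical name, the null checks are additions, and it assumes the page still embeds a `_sharedData` JSON object on a single line, as it did when this thread was written.

```js
// Hypothetical helper: pulls the post object out of a scraped post page.
// Returns null when the expected structure is missing (for example, when
// Instagram served a login page instead of the post).
function extractPostFromHtml(body) {
    // Same pattern as the library's one-liner: capture the JSON assigned
    // to `_sharedData`, up to the last semicolon on that line.
    const match = body.match(/_sharedData = (.+);/);
    if (!match) return null;

    const sharedData = JSON.parse(match[1]);
    // `entry_data` has a single key (the page type) holding an array of pages.
    const pages = Object.values(sharedData['entry_data'] || {});
    const firstPage = pages[0] && pages[0][0];
    if (!firstPage || !firstPage['graphql']) return null;

    // `graphql` has a single key holding the post itself.
    return Object.values(firstPage['graphql'])[0];
}
```

Returning `null` instead of throwing makes it easier to tell a changed page layout apart from a network failure.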

Poster

> When not authenticated, the full browser page will be scraped, see :
>
> `Object.values(Object.values(JSON.parse(body.match(/_sharedData = (.+);/)[1])['entry_data'])[0][0]['graphql'])[0])`

It looks like this is not working as expected. It always returns 429 instead of scraping the web page.

> I see the endpoint works indeed, but in order to scrape the query hash of any client, can you please tell me on which page, doing which action, did you triggered this GraphQL query ?

View the posts under a hashtag, for example https://www.instagram.com/explore/tags/miamiboatlife/. Click on one post to see its details, then navigate to the next post. With the web inspector open, you will see a call to the endpoint https://www.instagram.com/graphql/query/?query_hash=cf28bf5eb45d62d4dc8e77cdb99d750d&variables={"shortcode":"CNBbcLJgqXG"}

KaKi87 commented 2 years ago
Owner

> Looks like this one is not working as expected. It always return 409 instead of scraping the web page.

Do you mean 429 ?

Because 409 is something else.

The library returns 429 when the client is redirected to the login page.

The library returns 409 when the client is redirected to the captcha page.

The latter would only happen when you're authenticated.

> view posts under a hashtag [...] Then, you navigate to the next post

I'll try that, thanks.

Poster

Yes, I meant 429. Thanks for looking into adding the new endpoint.

KaKi87 commented 2 years ago
Owner

I just added a parameter to the `getPost` method so you can use a query hash on demand. (fc39063)

```js
client.getPost('CNBbcLJgqXG', { useGraphQL: true });
```

You should know, though, that GraphQL is also rate-limited by Instagram, just like non-GraphQL requests.
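
Since either path can be rate-limited, callers may want to retry after a pause. The following is a minimal sketch, not part of the library: `getPostWithRetry`, its options and the delay values are all assumptions, and it presumes that `getPost` rejects with the bare number `429` as in the redirect-handling snippet earlier in this thread.

```js
// Hypothetical wrapper: retries a rate-limited getPost call with
// exponential backoff. `sleep` is injectable so the wait can be stubbed.
async function getPostWithRetry(client, shortcode, {
    retries = 3,          // assumed default: retry at most 3 times
    baseDelayMs = 1000,   // assumed default: 1 s, doubled on each retry
    sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
} = {}) {
    for (let attempt = 0; ; attempt++) {
        try {
            return await client.getPost(shortcode, { useGraphQL: true });
        } catch (error) {
            // Only rate-limit rejections are retried; anything else, or
            // running out of retries, is passed on to the caller.
            if (error !== 429 || attempt >= retries) throw error;
            await sleep(baseDelayMs * 2 ** attempt);
        }
    }
}
```

Because Instagram's actual limits are unknown, the delays here are arbitrary starting points rather than tuned values.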

Poster

I appreciate your quick response on this.

-Young

Poster

> that GraphQL is also rate-limited by Instagram

Do you know the exact rate limit for GraphQL requests? Is it 200 requests per hour per IP? I'm not sure which mechanism Instagram is applying. Any info would be helpful.

KaKi87 commented 2 years ago
Owner

Unfortunately, I have absolutely no idea which rate limits Instagram is applying on any of its endpoints.

KaKi87 closed this issue 2 years ago
This repo is archived. You cannot comment on issues.
Reference: KaKi87/scraper-instagram-v1#10