How to correctly make a POST request to a login form and get the session token?
I'm building a web crawler to scrape information from a web page that sits behind a login form. So I need to log in to the app with a bare HTTP request first to be able to scrape the info I need.
As far as I know, the script needs to send a POST request with the urlencoded login params plus the csrf-token, and also the unauthenticated session cookie in a cookie header, right?
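A minimal sketch of the urlencoded body described above, using only the Elixir standard library. The module name `LoginBodySketch` is hypothetical, the field names are the ones used in the login code further down, and the sample values are placeholders. `URI.encode_query/1` percent-encodes the brackets in the keys and any special characters in the token.

```elixir
defmodule LoginBodySketch do
  # Hypothetical helper: builds an application/x-www-form-urlencoded body.
  # A list of tuples is used so the field order stays as written.
  def build(email, password, authenticity_token) do
    URI.encode_query([
      {"authenticity_token", authenticity_token},
      {"user[email]", email},
      {"user[password]", password},
      {"user[remember_me]", "0"}
    ])
  end
end

# Placeholder values, only for illustration.
body = LoginBodySketch.build("user@example.com", "secret pass", "tok+en/==")
IO.puts(body)
```

If the HTTP client does not encode a map body by itself, a string like this would be sent together with a `content-type: application/x-www-form-urlencoded` header.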
The app that I'm trying to scrape is: https://app-sandbox.kobana.com.br
Looking at the headers/body of the GET request for the login page, I found:
- a meta tag with name csrf-param and content set to authenticity_token
- a meta tag with name csrf-token and its content
- a hidden input tag with name authenticity_token and its content
- an unauthenticated session-token that can be retrieved from the set-cookie header
When I do the POST request in the browser, what is sent is this authenticity_token and a cookie header with multiple cookie values that I can't find in any request/response header.
So I tried to reproduce this login flow with cURL, and after sending the correct body/headers I receive a 302 response, which is OK. But when I follow the redirect link I get a 404 not_found response.
I also tried this flow with Insomnia/Postman, and in that case the login succeeds.
My current code to retrieve the authenticated session-token is:
defmodule KobanaBrowserClient do
  defp base_url, do: "https://app-sandbox.kobana.com.br"
  defp sign_in_path, do: base_url() <> "/users/sign_in"

  def sign_in(body, cookie) do
    HttpClient.post(build_client(), sign_in_path(), body, headers: [{"cookie", cookie}])
  end

  def get_session_cookie(email, password) do
    {:ok, response} = get_sign_in_page()
    # Extract the unauthenticated cookie from the response headers
    cookie = parse_session_cookie(response.headers)
    {:ok, document} = Floki.parse_document(response.body)
    # Extract the CSRF token from the body of the login page
    authenticity_token =
      document
      |> Floki.find(~s(form input[name="authenticity_token"]))
      |> Floki.attribute("value")
      |> Floki.text()

    req_body = %{
      "user[email]" => email,
      "user[password]" => password,
      "user[remember_me]" => 0,
      "authenticity_token" => authenticity_token
    }

    {:ok, login_response} = sign_in(req_body, cookie)
    parse_session_cookie(login_response.headers)
  end

  # Concatenates the raw values of all set-cookie headers, separated by spaces
  def parse_session_cookie(headers) do
    headers
    |> Enum.reduce("", fn
      {"set-cookie", cookie}, acc -> acc <> " " <> cookie
      _, acc -> acc
    end)
    |> String.trim()
  end
end
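One thing that may be worth checking in `parse_session_cookie/1`: a `set-cookie` value typically carries attributes after the first `;` (for example `path=/; HttpOnly`), while a `cookie` request header expects only `name=value` pairs joined by `; `. The following is a sketch under the assumption that headers arrive as a list of `{name, value}` tuples, as in the module above. The module name `CookieHeaderSketch` and the sample header values are hypothetical.

```elixir
defmodule CookieHeaderSketch do
  # Hypothetical helper: turns Set-Cookie response headers into the value
  # of a single Cookie request header.
  def to_cookie_header(headers) do
    headers
    # Header names are case-insensitive, so compare in lowercase.
    |> Enum.filter(fn {name, _value} -> String.downcase(name) == "set-cookie" end)
    # Keep only the "name=value" part before the first ";" (drops attributes).
    |> Enum.map(fn {_name, value} ->
      value |> String.split(";", parts: 2) |> hd() |> String.trim()
    end)
    |> Enum.join("; ")
  end
end

# Made-up header values, only for illustration.
headers = [
  {"content-type", "text/html"},
  {"set-cookie", "_boletosimples_session=abc; path=/; HttpOnly"},
  {"set-cookie", "boletosimples_visitor_guid=123; path=/"}
]

IO.puts(CookieHeaderSketch.to_cookie_header(headers))
```

Whether this is what causes the 404 is not certain; it only removes one difference between the hand-built request and what a browser would send.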
And the cURL command generated by Insomnia is:
curl --request POST \
--url https://app-sandbox.kobana.com.br/users/sign_in \
--header 'Content-Type: application/x-www-form-urlencoded' \
--cookie '_boletosimples_session=eaI6y7pJ1%252BHj0CyC3%252BgWE%252FCEYV4K5mFoYANYKXCasH1LvhrmShvZUGdeYJQo3QCSXC%252FTU9s4V77Csc82FsOhX8bJz%252F%252BVRSvMKFa2I%252BHV4Nu5OeeIA9fXzooRSljtik9SvKD3of1dVY68EZXe67mXberh2i0P1eemc2ixS0Zp%252FWhaxQr7o1BOKJVL6EARV2uUL7hfwWg58z%252BHmkAZ2eGpHwrWESOuR7ALc8X%252Fk5l35RVeKIsbbcduxFqsQlh4olBJ8gxYNQVZXPy1p1VTvrMN8yGhlPGbFxzCJG4hScYMneB0Ph4%252Ba5HFA0F84hd%252FOeiWjCPTkeDDNcajHgaH8YI6grWRfS42AEziRLK6VnQuZtROue%252FNIgn%252BeCqwofW%252BdMrpbvppbc19uEWxhDCrzvit0Cc7aE1LltUzlAWJmIlrNXxUJlN7JM%252FJGL6AoMe4iYyj7%252BbRasuQ%252B9WQRycVYK%252Bxe40ZG6DqjY7wolBxw%252FC6mAoWpHyYv696UUoV1uFQahK%252BbOad8nQ9Myhvj3iFPaV3isqMMdlTWaQIEjOkgLT13dXVUxSLcrT62VtZC%252Fa5jSo31%252FDxmFfEQZv0zlW8e3eJ%252BpgPeUWBWRc%252BUMiJDMPQ1sDknka2w4gMAGRna0B7IoIxG38vkFwAQ3Ob8wPn6t4SU%252FfgZkCjmjggxs87GNXyaZjHqVyYjIr4IIODiwd9kRo1FATu%252Bms516lhNi3aF1wFqZyWS46yMlq0ZlUDSFXFI73U1ggswN2DIeOmFGBiw6ZoW%252B8CEWrZ0QOHXAyyMVGI8FL1FtaHnBp0prmvW88I--FAwVzHnQF%252BJg4NJ0--uTpgEAZoQEQdp%252BBHaJMU%252BQ%253D%253D; user.id=eyJfcmFpbHMiOnsibWVzc2FnZSI6Ik1qZzJOZz09IiwiZXhwIjpudWxsLCJwdXIiOiJjb29raWUudXNlci5pZCJ9fQ%253D%253D--4b3f7664fa367f606bcddead8e20c13202cb8415; user.expires_at=eyJfcmFpbHMiOnsibWVzc2FnZSI6IklqSXdNakl0TURNdE1UUlVNVGM2TlRVNk5ETXRNRE02TURBaSIsImV4cCI6bnVsbCwicHVyIjoiY29va2llLnVzZXIuZXhwaXJlc19hdCJ9fQ%253D%253D--5f3bcd28b81d798b5270421b5f4d610b4ec200a1; user_email=eyJfcmFpbHMiOnsibWVzc2FnZSI6IkluWnBkRzl5TG14bFlXeEFjMjlzWm1GamFXd3VZMjl0TG1KeUlnPT0iLCJleHAiOm51bGwsInB1ciI6ImNvb2tpZS51c2VyX2VtYWlsIn19--0e04fcebc5f9ce0acdbe7e92d55c19f28ab37121; user_tracking_uid=eyJfcmFpbHMiOnsibWVzc2FnZSI6IklqazJaVFUzTVRBeExUbGxOVEl0WlRFME55MW1NMlJsTFRjNFl6SmtNREF4TVRFeE1pST0iLCJleHAiOm51bGwsInB1ciI6ImNvb2tpZS51c2VyX3RyYWNraW5nX3VpZCJ9fQ%253D%253D--ee68c7d2dc9f463a4689769a1c4608158ce41463; boletosimples_visitor_guid=96e57101-9e52-e147-f3de-78c2d0011112' \
--data authenticity_token=Dj-_4TVjbSwmDjjorQvI37wMWy5pHmcrVSlkp2EkpojorZiP2a3_AxWiZ47Rkdh3pvoY55j1EIH4z-ze5qUYMg \
--data 'user[email]=account_email' \
--data 'user[password]=account_pass' \
--data 'user[remember_me]=0'
Which fails with a 404 response. What could be happening?
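One detail visible in the command above is that the cookie values contain sequences such as `%252B` and `%253D`, which look like percent-encoding applied twice (`%252B` decodes to `%2B`, which in turn decodes to `+`). Whether the server received them this way or the export step added the extra layer is not known. The sketch below, using a hypothetical module name `CookieEncodingSketch`, shows one way to detect and undo a single extra layer with `URI.decode/1`.

```elixir
defmodule CookieEncodingSketch do
  # Heuristic: "%25" followed by two hex digits suggests that an already
  # percent-encoded value was encoded a second time.
  def double_encoded?(value) do
    Regex.match?(~r/%25[0-9A-Fa-f]{2}/, value)
  end

  # Removes exactly one layer of percent-encoding.
  def decode_once(value), do: URI.decode(value)
end

# Fragment taken from the start of the session cookie in the command above.
fragment = "eaI6y7pJ1%252BHj0CyC3%252BgWE%252FCEYV4K5mFoYANYKXCasH1"

if CookieEncodingSketch.double_encoded?(fragment) do
  IO.puts(CookieEncodingSketch.decode_once(fragment))
end
```

Comparing the decoded value with the raw `set-cookie` value returned by the server would show which form the cookie header should carry.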
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
