{"id":15098,"date":"2023-01-01T12:15:20","date_gmt":"2023-01-01T12:15:20","guid":{"rendered":"http:\/\/www.max-sperling.bplaced.net\/?p=15098"},"modified":"2024-02-16T10:44:08","modified_gmt":"2024-02-16T10:44:08","slug":"download-files-from-webpage-not-listed-in-source","status":"publish","type":"post","link":"http:\/\/www.max-sperling.bplaced.net\/?p=15098","title":{"rendered":"Download files from page (not listed in source)"},"content":{"rendered":"<p><strong>Scenario<\/strong><\/p>\n<p>You scroll a webpage and it loads files step by step, and you want to download them. You check the source code of that page, but the files are not listed there.<\/p>\n<hr>\n<p><strong>Download<\/strong><\/p>\n<p>The following PowerShell script shows how to download the files based on a HAR file captured while the page loads.<\/p>\n<pre>\r\nfunction Search-Items {\r\n    param ([string]$File, [string]$Pattern)\r\n\r\n    Select-String -Path $File $Pattern -AllMatches | ForEach-Object {\r\n        $Items += @($_.Matches.Value)\r\n    }\r\n\r\n    return $Items\r\n}\r\n\r\nfunction Download-Items {\r\n    param ([Parameter(ValueFromPipeline)] [string[]]$Items)\r\n\r\n    process {\r\n        foreach ($Item in $Items) {\r\n            Write-Output $Item\r\n            $File = $Item.Substring($Item.LastIndexOf(\"\/\") + 1)\r\n            Invoke-WebRequest $Item -OutFile $File\r\n        }\r\n    }\r\n}\r\n\r\n$ProgressPreference = \"SilentlyContinue\" # Suppresses progress bar\r\n\r\n$HARFile = \"website.com.har\"\r\n$Pattern = \"https:\/\/website\\.com\/[^,]*\\.jpg\"\r\n\r\nSearch-Items $HARFile $Pattern | Select-Object -Unique | Download-Items\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Scenario You scroll a webpage and it loads files step by step, and you want to download them. 
You check<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false},"categories":[55],"tags":[],"_links":{"self":[{"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=\/wp\/v2\/posts\/15098"}],"collection":[{"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=15098"}],"version-history":[{"count":1,"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=\/wp\/v2\/posts\/15098\/revisions"}],"predecessor-version":[{"id":16824,"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=\/wp\/v2\/posts\/15098\/revisions\/16824"}],"wp:attachment":[{"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=15098"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=15098"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.max-sperling.bplaced.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=15098"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}