Fetching Remote File Contents with a Browser Emulator (BrowserEmulator)

 

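For security reasons, hosts often disable remote URL access through fopen and file_get_contents by setting allow_url_fopen to Off. If you still need to fetch remote files the way those functions do, you can use this class, which talks to the server over a raw socket (fsockopen) instead.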
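Whether the restriction applies on a given host can be checked at run time. The following is a minimal sketch, not part of the original class: the helper name `remote_fopen_allowed()` is hypothetical, while `ini_get()` and `filter_var()` are standard PHP functions.

```php
<?php
// Hypothetical helper: reports whether URL wrappers are enabled
// for fopen() / file_get_contents() on this host.
function remote_fopen_allowed() {
    // ini_get() returns the current value as a string (for example "1" or ""),
    // or false if the option does not exist.
    $value = ini_get('allow_url_fopen');
    // FILTER_VALIDATE_BOOLEAN maps "1", "on", "true", "yes" to true, anything else to false.
    return filter_var($value, FILTER_VALIDATE_BOOLEAN);
}

if (remote_fopen_allowed()) {
    echo "allow_url_fopen is on: file_get_contents() can read URLs directly\n";
} else {
    echo "allow_url_fopen is off: use a socket- or cURL-based fetcher instead\n";
}
```

A check like this lets calling code pick the built-in functions when they work and fall back to BrowserEmulator only when they do not.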

 

 

<?php


/* used for the transmission RPC connection 
 * and the SABnzbd+ file submit 
 */


/***************************************************************************


Browser Emulating file functions v2.0.1-torrentwatch
(c) Kai Blankenhorn
www.bitfolge.de/browseremulator
kaib@bitfolge.de




This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.


This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.


You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.


****************************************************************************


Changelog:


v2.0.1-torrentwatch2 by Erik Bernhardson
  multi-part post with file submit


v2.0.1-torrentwatch by Erik Bernhardson
  converted file() to file_get_contents()
  converted lastResponse to string from array to mimic file_get_contents
  added gzip compression support


v2.0.1
  fixed authentication bug
  added global debug switch


v2.0   03-09-03
  added a wrapper class; this has the advantage that you no longer need
    to specify a lot of parameters, just call the methods to set
    each option
  added option to use a special port number, may be given by setPort or
    as part of the URL (e.g. server.com:80)
  added getLastResponseHeaders()


v1.5
  added Basic HTTP user authorization
  minor optimizations


v1.0
  initial release






***************************************************************************/


/**
* BrowserEmulator class. Provides methods for opening urls and emulating
* a web browser request.
**/
class BrowserEmulator {
  var $headerLines = Array();
  var $postData = Array();
  var $multiPartPost = False;
  var $authUser = "";
  var $authPass = "";
  var $port;
  var $lastResponse = '';
  var $lastRequest = '';
  var $debug = false;
  var $customHttp = False;
 
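  // PHP 4-style constructors (a method named after the class) are no longer
  // treated as constructors in PHP 8, so use __construct() instead.
  public function __construct() {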
    $this->resetHeaderLines();
    $this->resetPort();
  }
    /**
  * Adds a single header field to the HTTP request header. The resulting header
  * line will have the format
  * $name: $value\n
  **/
  public function addHeaderLine($name, $value) {
    $this->headerLines[$name] = $value;
  }
 
  /**
  * Deletes all custom header lines. This will not remove the User-Agent header field,
  * which is necessary for correct operation.
  **/
  public function resetHeaderLines() {
    $this->headerLines = Array();
   
    /*******************************************************************************/
    /**************   YOU MAY SET THE USER AGENT STRING HERE   *******************/
    /*                                                   */
    /* the original default was "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", */
    /* i.e. Internet Explorer 6.0 on WinXP; this version sends Firefox 3.0.10 on Linux */
   
    $this->headerLines["User-Agent"] = 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.10) Gecko/2009042315 Firefox/3.0.10';


    /*******************************************************************************/
    /**
    * Set default to accept gzip encoded files.
    * The media-type wildcard that was used here is not a valid content coding.
    */
    $this->headerLines["Accept-Encoding"] = "gzip";
  }
 
  /**
  * Add a post parameter. Post parameters are sent in the body of an HTTP POST request.
  **/
  public function addPostData($name, $value = '') {
    $this->postData[$name] = $value;
  }
 
  /**
  * Deletes all custom post parameters.
  **/
  public function resetPostData() {
    $this->postData = Array();
  }


  public function handleMultiPart() {
    $boundry = '----------------------------795088511166260704540879626';


    $this->headerLines["Accept"] = ' text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
    $this->headerLines["Connection"] = 'Close';
    $this->headerLines["Content-Type"] = "multipart/form-data; boundary=$boundry";
    $out = '';
    foreach($this->postData as $item => $data) {
      if(is_array($data)) {
        $out .= "--$boundry\r\n"
               ."Content-Disposition: form-data; name=\"$item\"; filename=\"{$data['filename']}\"\r\n"
               ."Content-Type: application/octet-stream\r\n"
               ."\r\n"
               .$data['contents']."\r\n";
      } else {
        $out .= "--$boundry\r\n"
               ."Content-Disposition: form-data; name=\"$item\"\r\n"
               ."\r\n"
               .$data."\r\n";
      }
    }
    $out .= "--{$boundry}--\r\n";
    return $out;
  }


  /**
  * Sets an auth user and password to use for the request.
  * Set both as empty strings to disable authentication.
  **/
  public function setAuth($user, $pass) {
    $this->authUser = $user;
    $this->authPass = $pass;
  }
  /**
  * Selects a custom port to use for the request.
  **/
  public function setPort($portNumber) {
    $this->port = $portNumber;
  }
 
  /**
  * Resets the port used for request to the HTTP default (80).
  **/
  public function resetPort() {
    $this->port = 80;
  }


  /**
   * Parse any cookies appended to the URL after ':COOKIE:', and return the trimmed URL
   **/
  public function preparseURL($url) {
    if($cookies = stristr($url, ':COOKIE:')) {
      $url = rtrim(substr($url, 0, -strlen($cookies)), '&');
      $this->addHeaderLine("Cookie", '$Version=1; '.strtr(substr($cookies, 8), '&', ';'));
    }
    return $url;
  }


  /**
  * Make an fopen call to $url with the parameters set by previous member
  * method calls. Send all set headers, post data and user authentication data.
  * Returns a file handle on success, or false on failure.
  **/
  public function fopen($url) {
    $url = $this->preparseURL($url);
    // lastResponse is a string (see changelog); header lines are appended with .= below.
    $this->lastResponse = '';
   
    $parts = parse_url($url);
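    // parse_url() omits keys that are absent from the URL, so guard each lookup.
    $protocol = isset($parts['scheme']) ? $parts['scheme'] : 'http';
    $server = isset($parts['host']) ? $parts['host'] : '';
    $port = isset($parts['port']) ? $parts['port'] : '';
    $path = isset($parts['path']) ? $parts['path'] : '';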
    if(isset($parts['query'])) {
      $path .= '?'.$parts['query'];
    }


    if($protocol == 'https') {
      // TODO: https is locked to port 443, why?
      $server = 'ssl://'.$server;
      $this->setPort(443);
    } elseif ($port!="") {
        $this->setPort($port);
    }
    if ($path=="") $path = "/";
    $socket = false;
    $socket = fsockopen($server, $this->port);
    if ($socket) {
        if ($this->authUser!="" && $this->authPass!="") {
          $this->headerLines["Authorization"] = "Basic ".base64_encode($this->authUser.":".$this->authPass);
        }
      
        if($this->customHttp)
          $request = $this->customHttp." $path\r\n";
        elseif (count($this->postData)==0)
          $request = "GET $path HTTP/1.0\r\n";
        else
          $request = "POST $path HTTP/1.1\r\n";


        $request .= "Host: {$parts['host']}\r\n";
       
        if ($this->debug) echo $request;
        if (count($this->postData)>0) {
          if($this->multiPartPost) {
            $PostString = $this->handleMultiPart();
          } else {
            $PostStringArray = Array();
            foreach ($this->postData AS $key=>$value) {
              if(empty($value))
                $PostStringArray[] = $key;
              else
                $PostStringArray[] = "$key=$value";
            }
            $PostString = join("&", $PostStringArray);
          }
          $this->headerLines["Content-Length"] = strlen($PostString);
        }
       
        foreach ($this->headerLines AS $key=>$value) {
          if ($this->debug) echo "$key: $value\n";
          $request .= "$key: $value\r\n";
        }
        if ($this->debug) echo "\n";
        $request .= "\r\n";
        if (count($this->postData)>0) {
          $request .= $PostString;
        }
    }
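    if (!$socket) {
      // Connection failed: there is no request to send and no response to read.
      return FALSE;
    }
    $this->lastRequest = $request;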


    for ($written = 0; $written < strlen($request); $written += $fwrite) {
      $fwrite = fwrite($socket, substr($request, $written));
      if (!$fwrite) {
        break;
      }
    }
    if ($this->debug) echo "\n";
    if ($socket) {
      $line = fgets($socket);
      if ($this->debug) echo $line;
      $this->lastResponse .= $line;
      $status = substr($line,9,3);
      while (trim($line = fgets($socket)) != ""){
        if ($this->debug) echo "$line";
        $this->lastResponse .= $line;
        if ($status=="401" AND strpos($line,"WWW-Authenticate: Basic realm=\"")===0) {
          fclose($socket);
          return FALSE;
        }
      }
    }
    return $socket;
  }
  
  /**
  * Make a file_get_contents-style call to $url with the parameters set by previous member
  * method calls. Send all set headers, post data and user authentication data.
  * Returns the requested file as a string on success, or false on failure.
  **/
  public function file_get_contents($url) {
    if(file_exists($url)) // local file
      return file_get_contents($url);
    $file = '';
    $socket = $this->fopen($url);
    if ($socket) {
        while (!feof($socket)) {
          $file .= fgets($socket);
        }
    } else {
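        // Yii is only available inside the torrentwatch application; use error_log() elsewhere.
        if (class_exists('Yii', false))
          Yii::log('Browser Emulator: file_get_contents bad socket', CLogger::LEVEL_ERROR);
        else
          error_log('Browser Emulator: file_get_contents bad socket');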
        return FALSE;
    }
    fclose($socket);


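    // HTTP header names are case-insensitive, so match without regard to case.
    if(stripos($this->lastResponse, 'Content-Encoding: gzip') !== FALSE) {
      if(function_exists('gzinflate')) {
        // Skip the fixed 10-byte gzip header; assumes no optional header fields are present.
        $file = gzinflate(substr($file,10));
        if($this->debug) echo "Result file: ".$file;
      }
    }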


    return $file;
  }


  /**
   * Simulate a file() call by exploding file_get_contents()
   **/
  public function file($url) {
    $data = $this->file_get_contents($url);
    if($data)
      // Double quotes are required: '\n' in single quotes is a literal backslash followed by n.
      return explode("\n", $data);
    return False;
  }
 
  public function getLastResponseHeaders() {
    return $this->lastResponse;
  }
}
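The body produced by handleMultiPart() can be inspected without opening a connection. The sketch below repeats the same loop as a standalone function so the output is easy to see; the function name `build_multipart_body()`, the field names `title` and `upload`, and the short boundary string are illustrative assumptions, not part of the class.

```php
<?php
// Hypothetical standalone version of the body-building loop in handleMultiPart().
function build_multipart_body(array $fields, $boundary) {
    $out = '';
    foreach ($fields as $name => $data) {
        // Every part starts with "--" followed by the boundary.
        $out .= "--$boundary\r\n";
        if (is_array($data)) {
            // File part: expects 'filename' and 'contents' keys, as the class does.
            $out .= "Content-Disposition: form-data; name=\"$name\"; filename=\"{$data['filename']}\"\r\n"
                  . "Content-Type: application/octet-stream\r\n\r\n"
                  . $data['contents'] . "\r\n";
        } else {
            // Plain form field.
            $out .= "Content-Disposition: form-data; name=\"$name\"\r\n\r\n"
                  . $data . "\r\n";
        }
    }
    // Closing delimiter: the boundary followed by two more dashes.
    return $out . "--{$boundary}--\r\n";
}

$body = build_multipart_body(
    array(
        'title'  => 'hello',
        'upload' => array('filename' => 'demo.txt', 'contents' => 'file body'),
    ),
    'XBOUNDARYX'
);
echo $body;
```

With the class itself, the same kind of request would be prepared by calling addPostData() for each field, passing an array with `filename` and `contents` for file fields, and setting `$multiPartPost` to true before calling file_get_contents().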

 

 

Example:

 

$be = new BrowserEmulator();

$output = $be->file_get_contents("http://tvbinz.net/rss.php");
$response = $be->getLastResponseHeaders();

echo $output;
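The class also accepts cookies appended to the URL after a `:COOKIE:` marker, which preparseURL() strips off and turns into a Cookie header. The sketch below mirrors that parsing as a standalone function so the effect can be seen directly; the name `split_cookie_url()` and the example URL are hypothetical.

```php
<?php
// Hypothetical standalone version of the parsing done in preparseURL().
// Returns array(cleanUrl, cookieHeaderValue); the second item is null when no cookies are present.
function split_cookie_url($url) {
    // stristr() returns everything from the first ':COOKIE:' marker to the end, or false.
    $cookies = stristr($url, ':COOKIE:');
    if ($cookies === false) {
        return array($url, null);
    }
    // Cut the marker and cookie list off the URL, then drop a trailing '&'.
    $cleanUrl = rtrim(substr($url, 0, -strlen($cookies)), '&');
    // Drop the 8-character marker and turn 'a=1&b=2' into 'a=1;b=2'.
    $header = '$Version=1; ' . strtr(substr($cookies, 8), '&', ';');
    return array($cleanUrl, $header);
}

list($cleanUrl, $cookieHeader) = split_cookie_url(
    'http://example.com/rss.php?id=1&:COOKIE:uid=42&pass=abc'
);
echo $cleanUrl, "\n";      // http://example.com/rss.php?id=1
echo $cookieHeader, "\n";  // $Version=1; uid=42;pass=abc
```

Passing such a URL to file_get_contents() on a BrowserEmulator instance should therefore request the clean URL and send the cookie pairs in the Cookie header.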

 

 

 

Source: http://code.google.com/p/torrentwatch/source/browse/branches/yii/protected/components/downloadClients/browserEmulator.php?spec=svn780&r=780

 

 

Related:

Fetching remote file contents in PHP (cURL)

 

 

 

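function curl_get_contents($url)
{
	// Use the URL's parent directory as the Referer header.
	$dir = pathinfo($url);
	$host = $dir['dirname'];
	$refer = $host.'/';

	$ch = curl_init($url);
	curl_setopt($ch, CURLOPT_REFERER, $refer);
	// Return the response body as a string instead of printing it.
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
	// CURLOPT_BINARYTRANSFER is obsolete in current PHP versions (raw output is
	// always returned with CURLOPT_RETURNTRANSFER), so it is omitted.
	$data = curl_exec($ch); // false on failure
	curl_close($ch);
	
	return $data;
}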

 

 

 
