
WWW robots and proxies

Automatic processing of HTTP requests

Applications and Utilities: Showing

Dead Link Check (DLC) - HTTP link checker written in Perl. Can generate HTML output for easy checking of results, and can process a link cache file to speed up repeated requests. Initially created as an extension to Public Bookmark Generator (PBM); can be used alone. {(L)GPL}

At Sourceforge (Production/Stable)

site-dater.pl - Generates a table of web links within a local hierarchy sorted by date. {PD}

(Info at freshmeat)

SiteMap - Creates an HTML SiteMap of your *.*htm* files {GPL}

(Info at freshmeat)

ht://Dig - Complete world wide web indexing and searching system {GPL}

(Info at freshmeat)
- The ht://Dig system is a complete indexing and searching system for a domain or intranet. This system is not meant to replace the need for powerful internet-wide search systems like Lycos, Infoseek, Google and AltaVista. (At Sourceforge)

Checklinks - HTML link checker that supports SSI, many Apache options, and more (in Perl 5) {OpenSource}

(Info at freshmeat)

PerlLeech - Given a set of keywords and file extensions, the program will query a set of search engines, search for matching files, and download them. You will be able to specify the maximum number of recursive page downloads. {(L)GPL}

At Sourceforge (Planning)

Web Resource Application Framework - Wraf implements an RDF API that hopes to realize the Semantic Web. The framework uses RDF for data, user interface, modules and object methods. It uses interfaces to other sources in order to integrate all data in one environment, regardless of storage f {(L)GPL}

At Sourceforge (Alpha)

mebay - MeBay is a Perl/GTK client for eBay with support for "My eBay" bid and watch items, and support for several types of item searching. Item images can also be displayed when possible. {(L)GPL}

At Sourceforge (Alpha)

Lucrezia cover traffic system - Simulates the behaviour of a human Web surfer by downloading pages, filling in forms, etc. and leaking realistic "personal information" to prevent marketers and other snoopy persons from tracking the behaviour of real human users. {oss}

At Sourceforge (Alpha)

Yet Another Ticker - YAT is another ticker program. However, it uses Yahoo's excellent and comprehensive news feed to create a ticker that can be read throughout the day. {(L)GPL}

At Sourceforge (Beta)

Headlines - Headlines is an application to combine all the Internet news feeders in one place. {(L)GPL}

At Sourceforge (Alpha)

DelphiRSS - DelphiRSS is a set of native Delphi VCL components which allow you to write applications that use and display RSS headlines. It incorporates an RSS parser and a component to retrieve RSS files via HTTP. {oss}

At Sourceforge (Beta)

KOffle - KOffle is a tool for managing your wwwoffle-spools (http and ftp) and your outgoing requests. {(L)GPL}

At Sourceforge (Beta)

notify - Notify (website) visitors of changes to your site. {GPL}

(Info at freshmeat)

CheckURL - Sends notification e-mails for changed URLs {GPL}

(Info at freshmeat)

DejaSearch - DejaSearch is a frontend to DejaNews, the leading Usenet archive {GPL}

dejasearch-1.8.6 - A frontend to DejaNews for searching Usenet archives (At FreeBSD Ports)
(Info at freshmeat)

Web Secretary - Web page monitoring software {GPL}

(Info at freshmeat)

netcomics - A perl script that downloads today's comics from the Web {GPL}

(Info at freshmeat)
Netcomics - A modularized program that downloads comic strips from the 'Net that are updated regularly. It can be used to download any time-based image that is updated periodically. (At Sourceforge)

sitecopy - Maintain remote copies of locally stored web sites {GPL}

sitecopy-0.11.4_2 - Maintains remote websites, uses FTP or WebDAV to sync up with local copy (At FreeBSD Ports)
sitecopy-0.10.15 - utility for synchronizing remote and local web sites (At NetBSD packages collection)
(Info at freshmeat)

DraE Tracking - Allows servers to provide free tracking to web sites {GPL}

(Info at freshmeat)

FastLink - FastLink is a free Java Applet that displays mirror sites sorted by their respon {GPL}

(Info at freshmeat)

The Internet Junkbuster - The Internet Junkbuster v2.0.2 {GPL}

EHeadlines - Root Menu news system. {x,GPL}

(Info at freshmeat)

gtkMeat - A Freshmeat new submissions ticker {x,GPL}

(Info at freshmeat)

gtkSlash - Gtk+ based Slashdot headlines news ticker {x,GPL}

(Info at freshmeat)

Kget - KDE app to get files from the internet {x,GPL}

asScotch - The day's UserFriendly comic strip in your AfterStep rootmenu {x,GPL}

(Info at freshmeat)

asTequila - The AfterStep Resource Page (TARP) headlines in your AfterStep rootmenu {x,GPL}

(Info at freshmeat)

Squid - High performance Web proxy cache {GPL}

squid-2.4.1 - Post-Harvest_cached WWW proxy cache and accelerator (At NetBSD packages collection)
(Info at freshmeat)

w3mir - HTTP copying and mirroring program {Artistic}

w3mir-1.0.10 - All-purpose HTTP copying and mirroring tool (At FreeBSD Ports)
(Info at freshmeat)

WWWOFFLE - Simple proxy server with special features for use with dial-up internet links {GPL}

wwwoffle-2.7b - A caching proxy server for HTTP and FTP designed for dial-up hosts (At FreeBSD Ports)
wwwoffle-2.5e.tgz - WWW OFFLine Explorer (At OpenBSD 2.8_packages i386)
wwwoffle-2.5e.tgz - WWW OFFLine Explorer (At OpenBSD 2.8_packages sparc)
wwwoffle-2.6c - WWW proxy with support for offline browsing (At NetBSD packages collection)
(Info at freshmeat)

freshmeat newsletter to HTML converter - procmail filter to convert freshmeat email newsletter to HTML {Artistic}

(Info at freshmeat)

webcrawl {PD}

webcrawl-1.10 - Download web sites without user interaction by following links (At FreeBSD Ports)
(Info at freshmeat)

ECLiPt-Mirror - Full-featured mirroring script {GPL}

pavuk - Webgrabber with an optional Xt or GTK GUI {GPL}

(Info at freshmeat)

snarf - Command-line URL retrieval tool with some unique features. {GPL}

snarf-7.0 - Another small command-line URL (http/ftp/gopher/finger) fetcher (At FreeBSD Ports)
(Info at freshmeat)

ticker - Configurable text scroller, with slashdot and freshmeat modules {GPL}

(Info at freshmeat)

curl - Tiny command line client for getting data from a URL {GPL}

curl-7.9.6 - Non-interactive tool to get files from FTP, GOPHER, HTTP(S) servers (At FreeBSD Ports)
curl-6.5.2.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.7_packages i386)
curl-7.3-kerberos.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.8_packages i386)
curl-7.3.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.8_packages i386)
curl-6.5.2.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.7_packages sparc)
curl-7.3-kerberos.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.8_packages m68k)
curl-7.3.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.8_packages m68k)
curl-7.3-kerberos.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.8_packages sparc)
curl-7.3.tgz - get files from FTP, GOPHER, HTTP or HTTPS servers (At OpenBSD 2.8_packages sparc)
curl-7.7.1 - client that groks URLs (At NetBSD packages collection)
(Info at freshmeat)
- Curl is a tool for transferring files with URL syntax, supporting FTP, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP. Curl supports HTTP POST, HTTP PUT, FTP uploading, HTTPS certificates, HTTP form based upload, proxies, cookies, user+password authen (At Sourceforge)

swebget - Prints a webpage to stdout {GPL}

(Info at freshmeat)

GNU Wget - Network utility to retrieve files from the World Wide Web {GPL}

(Info at freshmeat)

PathFinder - A personal web search engine {GPL}

(Info at freshmeat)

HTTPGate - A Filtering HTTP Gateway {GPL}

(Info at freshmeat)

Muffin - Filtering proxy server for the World Wide Web written entirely in Java {GPL}

(Info at freshmeat)
muffin - Muffin is a World Wide Web filtering system written entirely in Java that can filter any HTTP data sent and received by your web browser. (At Sourceforge)

tinyproxy - A small, lightweight, easy-to-configure HTTP proxy. {GPL}

tinyproxy-1.4.3 - A small, efficient HTTP proxy server (At FreeBSD Ports)
(Info at freshmeat)
- tinyproxy is a GPLed, lightweight HTTP proxy. Designed from the ground up to be fast and yet small, it is an ideal solution for sites where a full featured HTTP proxy is required, but the system resources for a larger proxy are unavailable. (At Sourceforge)

Internet Junkbuster - Blocks unwanted banner ads and protects your privacy {GPL}

(Info at freshmeat)

Kticker - News ticker widget that downloads news headlines and displays them periodically {x,GPL}

(Info at freshmeat)

urlredir - URL redirector for use with the squid proxy server {GPL}

(Info at freshmeat)

DailyUpdate - Grabs dynamic information from the internet and integrates it into your webpage {GPL}

(Info at freshmeat)

Web User Interface - Builds a list of all available personal homepages. {GPL}

(Info at freshmeat)

CGIProxy - Anonymizing, filter-bypassing HTTP proxy in a CGI script (in Perl) {OpenSource}

(Info at freshmeat)

Get Right - HTTP resume for failed transfers. {GPL}

(Info at freshmeat)

Web Tree Scanner - A program to visualize the tree of a WWW server and check the links [X] {GPL}

Slashdot Reader - Slashdot Reader written in Pike/GTK. [X] {PD}

httptunnel-3.3 - Tunnel a tcp/ip connection through a http/tcp/ip connection

At FreeBSD Ports
httptunnel-3.0.tgz - HTTP tunneling utility (At OpenBSD 2.7_packages i386)
httptunnel-3.0.3.tgz - HTTP tunneling utility (At OpenBSD 2.8_packages i386)
httptunnel-3.0.tgz - HTTP tunneling utility (At OpenBSD 2.7_packages sparc)
httptunnel-3.0.3.tgz - HTTP tunneling utility (At OpenBSD 2.8_packages m68k)
httptunnel-3.0.3.tgz - HTTP tunneling utility (At OpenBSD 2.8_packages sparc)
httptunnel - Creates a two-way data tunnel through an HTTP proxy

asGin - Linux Today headlines in your AfterStep root menu [X] {GPL}

urlmon - URL monitoring and report tool {GPL}


Libraries and Components: Showing

python web library - Powerful, lightweight web library for Python. These modules emulate the Request and Response objects from ASP, and Sess, Auth, Perm, and UserSess from PHPLIB. The sensible alternative to Zope! :) {oss}

At Sourceforge (Alpha)

curl_version - Return the current CURL version

At PHP: Manual

curl_close - Close a CURL session

At PHP: Manual

curl_exec - Perform a CURL session

At PHP: Manual

curl_setopt - Set an option for a CURL transfer

At PHP: Manual

curl_init - Initialize a CURL session

At PHP: Manual

HTTP::Status - Processes status codes sent over HTTP, e.g. "403 Forbidden", "404 Not Found", or "402 Payment Required". Part of the libwww bundle. [Perl] {oss}

At CPAN
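
For readers unfamiliar with status-code processing, here is a minimal sketch of the idea in Python, using the standard library's http.HTTPStatus rather than the Perl module listed above. The function names describe_status and is_client_error are hypothetical, chosen only for this illustration; they are not part of HTTP::Status.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a string such as '404 Not Found' for a registered status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Codes outside the registered set (for example 4040) land here.
        return f"{code} Unknown"
    return f"{status.value} {status.phrase}"


def is_client_error(code: int) -> bool:
    """True for the 4xx class, which signals a problem with the request."""
    return 400 <= code <= 499


if __name__ == "__main__":
    for code in (403, 404, 402):
        print(describe_status(code), "| client error:", is_client_error(code))
```

A robot would typically branch on the status class (2xx, 3xx, 4xx, 5xx) before deciding whether to retry, follow a redirect, or drop the URL.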

LWP::RobotUA - Create your own Web robot. Part of the libwww bundle. [Perl] {oss}

At CPAN

WWW::Robot - A traversal engine for your Web robot. [Perl] {oss}

At CPAN

WWW::RobotRules - Nice Web robots, as they scour the Net for treasure, heed a robots.txt file if they find one. Information about the Robot standard can be found in http://info.webcrawler.com/mak/projects/robots/norobots.html. [Perl] {oss}

At CPAN
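
As a rough illustration of what heeding a robots.txt file means in practice, the sketch below uses Python's standard urllib.robotparser instead of the Perl module above. The robots.txt content, the agent name ExampleBot, and the example.com URLs are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real robot would download this
# from the site's /robots.txt before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

AGENT = "ExampleBot"  # hypothetical robot name


def build_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried per URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(rules: RobotFileParser, url: str) -> bool:
    """Ask the rule set whether AGENT may fetch the given URL."""
    return rules.can_fetch(AGENT, url)


if __name__ == "__main__":
    rules = build_rules(ROBOTS_TXT)
    for url in ("http://example.com/index.html",
                "http://example.com/private/data.html"):
        print(url, "->", "fetch" if allowed(rules, url) else "skip")
```

The check runs before every request, so a disallowed path is never fetched at all rather than fetched and discarded.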

ARS - A Web client for Remedy's ARS system. Useful only if you're already using ARSPerl. [Perl] {oss}

At CPAN


Related Subjects (under Open source license)

WWW Servers - Respond to HTTP requests

WWW authoring - Creating HTML, CGI

WWW Browsers - User interface for accessing the WWW

Up to: World Wide Web - HTTP, HTML, standards, browsers, transfer utilities, servers, et al.


RocketAware.com is a service of Mib Software
Copyright 2002, Forrest J. Cavalier III. All Rights Reserved.