Is there a program that crawls a specified website and reports any references to another website? I have images, video files, PDFs, etc. that I need to hand over to another developer so they can finish porting the site to their new server.
I just transferred an old site to another person and they are still using my files. I don't know exactly where all the files are, and I want to be sure which files I need to give them. It would be nice to have something like linkchecker that can crawl the site and, whenever it finds a reference to a given website root (e.g. sub.domain.com), report where it found it (which page, and what the URL is).
I don't want to block the site from using the files at this point, so that option is out.
I'm on a Mac so any terminal program would be just fine.
You could try SiteSucker, which can download all the files used on a site (and any sites it links to, depending on the settings). It's OS X (and iPhone) donationware, so it might be just what you're looking for. I believe it creates a log of the files it downloads, so you could send that to your colleague if you just want to pass along the URLs rather than the actual files.
You could check out wget. It can download a website recursively (the -r option) and save its contents to your hard disk. Unless told otherwise, it downloads everything into directories named after the host.
But be careful not to download the whole internet recursively ;) so be sure to set the --domains or --exclude-domains options appropriately.
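If what you really want is a report of cross-site references rather than a full mirror, a small script can produce one. Below is a minimal sketch in Python (standard library only); START_URL and TARGET_HOST are placeholders for your own site and for the host whose files you are hunting for, and a real run would want niceties like rate limiting.

# Crawl pages under START_URL and print every link, image, script or
# stylesheet reference that points at TARGET_HOST, together with the page
# it was found on. START_URL and TARGET_HOST are placeholders.
import urllib.request
from urllib.parse import urljoin, urlparse
from html.parser import HTMLParser

START_URL = "http://www.example.com/"   # site to crawl (placeholder)
TARGET_HOST = "sub.domain.com"          # host you want references to (placeholder)

class RefParser(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.refs = []    # references to TARGET_HOST found on this page
        self.links = []   # same-site pages to crawl next

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = attrs.get("href") or attrs.get("src")
        if not url:
            return
        absolute = urljoin(self.base_url, url)
        if urlparse(absolute).netloc == TARGET_HOST:
            self.refs.append(absolute)
        if tag == "a" and urlparse(absolute).netloc == urlparse(START_URL).netloc:
            self.links.append(absolute)

seen, queue = set(), [START_URL]
while queue:
    page = queue.pop()
    if page in seen:
        continue
    seen.add(page)
    try:
        html = urllib.request.urlopen(page, timeout=10).read().decode("utf-8", "replace")
    except Exception as err:
        print("skipped", page, err)
        continue
    parser = RefParser(page)
    parser.feed(html)
    for ref in parser.refs:
        print(page, "->", ref)   # the page holding the reference, and the referenced URL
    queue.extend(link for link in parser.links if link not in seen)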
I want to secure static files (images, .txt files) from unauthenticated users. How can I implement user authentication on the website so that the static files in a specific folder are also secured? I have implemented simple authentication in a login.asp file, start a session for the authenticated user, and check the session value in the protected .asp files. But I have no idea how to secure static content on a Classic ASP website.
The website is hosted on IIS 7 with Integrated pipeline mode.
You already asked this, and I answered it, and I will give you the same answer.
You will need to use BASIC AUTHENTICATION to restrict access to static files in IIS (Classic ASP). Otherwise, you need to store the static content in another format, encrypt it, and make it viewable only to users authenticated by your application.
Please don't ask this again, the answers will not be different.
If using Basic Authentication is not your cup of tea, one possibility would be to replace your static files with an ASP file that, after checking authorization, outputs the correct file. If necessary, you can set the ContentType of the Response to the appropriate type. The link http://support2.microsoft.com/kb/173308 shows you how to do that with an image stored inside a database, but of course you can use whatever you want as the source of the file. In the case of .TXT files, you can even take the file directly and simply add a small section of ASP code at the beginning to do the check.
All of this requires extra work. There is no way to simply switch on some sort of session-based protection for static files without it.
Old question, but most MS servers with Classic ASP installed have several default folders which cannot be accessed except via ASP. They are /bin, /app_code, and /app_data, and there may be others; it depends on your hosting company. Windows 10 IIS (the cut-down dev and test version) locks these by default. Using ASP code to retrieve and display text and HTML is very easy, but I'm not sure how to do images. If you have very low traffic, one way would be to copy the image file to an unlocked folder under a random name, reference it normally in an IMG tag, then delete it after use. (I came here looking for a better method.)
Update: the answer to loading images via ASP is in the question "displaying images from sql database with classic asp"; see the bottom answer by "HeavenCore". Instead of Response.BinaryWrite rs("ImageBlob"), read the binary image data into your own variable, e.g. BinaryImageData, and then call Response.BinaryWrite BinaryImageData.
Can you copy a Composite C1 website? I would like to create a copy of an existing website as a new website.
I start by creating Site A. Then I want to copy it and create Site B.
For example: copy the pages, functions, data, content, layouts, and CSS from website A to website B. The only difference between the two would be the name.
It could infringe copyrights and may get you sued, but yes, it's possible with a scraper, which basically grabs the whole site and downloads it for you. Such tools are used by Google and other search engines to build cached copies of sites.
Some examples:
http://www.grepsr.com/
http://info.kapowsoftware.com/WebScrapingDefinitiveGuide.html
http://scrapy.org/
or just google "web scrapers"
If you own the site, however, and have FTP access, just copy the files to a folder called /b and it becomes www.a.com/b, or you can set up an addon domain pointing to /b and call that addon domain, say, www.b.com.
The answer to your question "can you copy a website?" is yes, you can.
Provided you have access to all the files and folders, it's no different than copying a bunch of folders on your computer into another folder.
So if you're on a shared host and everything is in your public_html folder, just put the whole website in one folder, then copy it over to another folder.
Then simply point your new domain to that folder through your hosting platform.
The process differs from host to host, but the actual answer to your question is: yes, you can copy a website from one folder to another.
If you have access to the files on the server, you can simply copy them to the other desired location.
But remember that you have to update links and other paths if they are absolute.
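For that fix-up, a quick find-and-replace over the copied files is often all you need. A rough sketch in Python; the folder path and the two domain names are placeholders for your own values, and it assumes the files are UTF-8 text.

# Rewrite absolute links after copying a site: every occurrence of the old
# domain inside .html, .htm and .css files under SITE_DIR is replaced with
# the new one. SITE_DIR, OLD_DOMAIN and NEW_DOMAIN are placeholders.
import os

SITE_DIR = "/path/to/copied/site"
OLD_DOMAIN = "http://www.a.com"
NEW_DOMAIN = "http://www.b.com"

for root, _dirs, files in os.walk(SITE_DIR):
    for name in files:
        if not name.lower().endswith((".html", ".htm", ".css")):
            continue
        path = os.path.join(root, name)
        with open(path, encoding="utf-8", errors="replace") as f:
            text = f.read()
        if OLD_DOMAIN in text:
            with open(path, "w", encoding="utf-8") as f:
                f.write(text.replace(OLD_DOMAIN, NEW_DOMAIN))
            print("updated", path)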
If you don't have access, you could use the developer tools (Firebug, or F12 in Chrome or IE) and copy each file and the source code by hand. This approach is more time-consuming than the previous one, but at least it can be done.
Cheers
As far as I know, the easiest way would be to use Internet Explorer's save-for-offline-viewing function (if it still exists); this copies all the resources of the currently open web page and recodes the HTML to use them. As for an entire website, I don't think it will be easy, for legal reasons.
If it's your own site, sure why not! Who is there to stop you?
But if it's someone else's site, of course you have to worry about copyright, and most of the time the website uses server-side scripts, which are not downloadable.
You can duplicate a Composite C1 website by copying the entire file structure to a new folder and then updating the installation id in ~/App_Data/Composite/Configuration/InstallationInformation.xml (put in a new random GUID). Then point a new IIS site at this new folder.
If your site is using SQL Server as a backend, you also need to create a copy of your database, create a new user account with dbo access to that database, and update the connection string in ~/web.config.
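A rough sketch of the file-side part of this in Python: it copies the site folder and swaps the first GUID found in InstallationInformation.xml for a fresh one. The paths are placeholders, the exact layout of that XML file can differ between C1 versions (hence the loose regex), and the database copy and web.config change still have to be done by hand.

# Copy a Composite C1 site folder and write a new installation id into the
# copy. SOURCE and DEST are placeholders; check the resulting XML by hand.
import os, re, shutil, uuid

SOURCE = r"C:\inetpub\SiteA"
DEST = r"C:\inetpub\SiteB"   # must not exist yet

shutil.copytree(SOURCE, DEST)

info_path = os.path.join(DEST, "App_Data", "Composite",
                         "Configuration", "InstallationInformation.xml")
with open(info_path, encoding="utf-8") as f:
    xml = f.read()

# Replace the first GUID-shaped string with a newly generated one.
guid = r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
xml = re.sub(guid, str(uuid.uuid4()), xml, count=1)

with open(info_path, "w", encoding="utf-8") as f:
    f.write(xml)
print("site copied; new installation id written to", info_path)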
If you wish to duplicate an entire page structure inside the existing instance of the CMS and share media files, templates etc. this could be done, but no tooling is available. This would be a coding task.
Copy the directory (the website's physical path) that the website points to and paste it somewhere else, then create a new website and point it to that copied directory.
I am just starting with Aptana and I don't have the original HTML files for my website. Is there a way I can import my whole website as a project, or do I have to open each page from within Aptana and save it with the original URLs?
Thanks
I use Interarchy, which is a commercial Mac option. One open-source Windows program is HTTrack.
If your site isn't large, it's often feasible to go through it page by page and save each one as source. You also need to save all the images and CSS files and reconstruct the directory structure, though it goes faster than you might think.
Good luck!
My customer is currently using MHT files for storing offline representations of browsed web pages. The files are saved and later viewed in Internet Explorer.
When viewing the files, we would like to be sure there is absolutely no network activity to the original site or any other site; the content should be browsed 100% offline, and it should not have any special "local" privileges either (e.g. access to file:// URLs). We would like to keep JS running if possible, and we can live with the consequences of features that are disabled because of working offline.
We are willing to change the viewer or even the file format (and convert all old mht files as well) if a better solution is suggested.
Thanks for any help on this,
Udi
It is not possible to guarantee that there will be no network activity unless you switch Internet Explorer to offline mode. The advantage of saving a web page as an MHT file is that all the information needed to display the page (including images) is stored in one file instead of several files and folders, which makes archiving easy; but if the content contains links to other pages, clicking those links will initiate network activity.
One option is to post-process the MHT file and replace the hyperlinks with just the title of the link. For example, replacing
<A=20
title=3D"Conduction band"=20
href=3D"http://en.wikipedia.org/wiki/Conduction_band">conduction =
bands</A>
with "Conduction band".
I was wondering if anyone knows how to upload files to a SharePoint (v3/MOSS) document library over FTP, or whether that is even possible. I know it is possible with WebDAV. If it is possible, is it supported by Microsoft?
I don't think so. I think your options are:
HTTP (via the upload page)
WebDAV
Web Services
The object model
You can map a drive to a SharePoint document library, for example \\serveraddress.domain.com\Documents. So I would try mapping a drive on your FTP server, then making sure files that come in over FTP get sent to that drive.
Big edit: Have any of you figured out how to upload to SharePoint (WSS)? I've tried drive mapping and then using Robocopy and SyncToy to copy files, thinking a tool might offer greater control (e.g. preserving the Date Modified attribute). As I understand it, the files are actually stored in SharePoint as database objects, so SharePoint views show the database (SQL) object's properties in document libraries, where a new user would expect to see the file properties. Those file properties are still alive! They just need to be uncovered by a different view. I particularly like the mapped-network-drive view of a SharePoint document library. File attributes are pretty important to my team, so we were concerned about that at the start. As an opinion, though, the default view showing attributes that appear to be incorrect is just plain annoying!
The best solution we've come up with for large file migrations into SharePoint is a mapped network drive combined with a tool called FreeFileSync (available on SourceForge) to move your files and folders. It's great because it produces verbose error messages and gives a lot of control, especially in the cases where SharePoint tries to block a particular filename or file extension.
Direct FTP into SharePoint is not one of your options. You would need to have a timer job run that checks your FTP directory and uploads into the document library.
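A rough sketch of such a job in Python: it polls an FTP drop folder and moves anything that has arrived onto a drive mapped to the document library (as in the mapped-drive answer above). Both paths are placeholders, and a production version would want error handling and a check that each file has finished uploading.

# Poll an FTP drop folder and move incoming files onto a drive that is
# mapped to a SharePoint document library. FTP_DROP and SP_DRIVE are placeholders.
import os
import shutil
import time

FTP_DROP = r"C:\inetpub\ftproot\incoming"
SP_DRIVE = r"S:\Shared Documents"       # drive letter mapped to the document library

while True:
    for name in os.listdir(FTP_DROP):
        src = os.path.join(FTP_DROP, name)
        if os.path.isfile(src):
            shutil.move(src, os.path.join(SP_DRIVE, name))
            print("moved", name)
    time.sleep(60)                      # check once a minute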
Yes it is possible.
The WebDAV Redirector allows you to access WebDAV resources (including SharePoint) via a UNC path, e.g. \\yourspserver\site\doclib. The IIS FTP server accepts UNC paths as backing storage for virtual directories.
On your FTP server, right-click the FTP site in IIS Manager and select "Add Virtual Directory". Give it a name and specify the SharePoint UNC path as the physical path. You'll need to set the "connect as" user to a domain user that has access to the SharePoint folder you're connecting to.
Connect to the FTP folder and you should be able to "cd" into the directory and put/get files without issue (just confirmed it myself). The only caveat is an age-old bug/feature of IIS FTP: it doesn't show a virtual directory in an ls/dir listing. The fix is to create a physical folder that mirrors the virtual directory's location. For example, if your FTP root is c:\inetpub\ftproot, create a directory there that matches the name of your virtual directory. It will then show up in an ls/dir listing, but the cd command will still move into the virtual directory, not the physical one.
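Once the virtual directory is in place, any ordinary FTP client can push files into the library. A small sketch using Python's ftplib; the server name, credentials, directory name, and file name are all placeholders.

# Upload a file over FTP into the virtual directory that is backed by the
# SharePoint document library. All names and credentials are placeholders.
from ftplib import FTP

ftp = FTP("ftp.example.com")
ftp.login("DOMAIN\\ftpuser", "password")
ftp.cwd("doclib")                        # the virtual directory created above
with open("report.pdf", "rb") as f:
    ftp.storbinary("STOR report.pdf", f)
ftp.quit()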
You can SFTP/FTP directly into your SharePoint document library using Couchdrop. It turns your SharePoint into a native SFTP/FTP server, lets you create additional users, and so on. Sing out if you need assistance; more than happy to help.
Full disclosure: I represent Couchdrop