storing images in databases, question

Astaroth

Your script can look at the HTTP_REFERER (sic) header to make sure your image is not being linked to from another page. But if you are using Apache you can let the web server do this in mod_rewrite.
The OP will be using a handler or "script" to serve up the image irrespective of whether it's in the database or on a central file server. It can therefore check for hotlinking irrespective of where the image is stored.
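As a rough sketch of the kind of check such a handler could run, assuming the Referer header has already been read from the request. The `ALLOWED_HOSTS` set and the `is_hotlink` name are made up for illustration, and treating a missing header as allowed is a design choice, not a rule:

```python
from urllib.parse import urlparse

# Hypothetical allow-list: the hosts that are permitted to embed your images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_hotlink(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True when the Referer header names a host outside the allow-list."""
    if not referer:
        # Many browsers and proxies omit the header, so a missing value is allowed here.
        return False
    host = urlparse(referer).hostname  # lower-cased by urlparse, None if unparseable
    # An unparseable Referer is treated as a hotlink in this sketch.
    return host is None or host not in allowed_hosts
```

The same function works whether the handler then reads the image from a blob column or from disk, which is the point being made above. The header is easy to forge, so this only deters casual hotlinking.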

Even if you look at Microsoft's own solutions built around SQL Server (e.g. SharePoint 2010), they store the files themselves in normal disk space, and SQL Server simply holds the metadata and the location of the file(s).

I'm also slightly confused and worried about people thinking that a file system is difficult to back up!

In this scenario the processing overheads involved retrieving a few smallish (64kb) images (per request) from the DB shouldn't be of much concern.
Whilst a single image wouldn't be a massive load, it could very quickly add up. eBay has been mentioned by someone else in the thread, so take that as a model: on every page served you have 100 images, each individually pulled from the DB. With 30 concurrent users you then have 3,000 images of 64 KB each being pulled from blobs in SQL Server.
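To put a number on that, here is the arithmetic as a quick sketch. The three figures are the ones from the post; the variable names are just for illustration:

```python
IMAGES_PER_PAGE = 100    # images on each page served
CONCURRENT_USERS = 30    # users requesting a page at the same time
IMAGE_SIZE_KB = 64       # size of each image blob

blob_reads = IMAGES_PER_PAGE * CONCURRENT_USERS   # individual blob reads
total_kb = blob_reads * IMAGE_SIZE_KB             # data pulled from the database
total_mb = total_kb / 1024

print(blob_reads, total_kb, total_mb)  # 3000 192000 187.5
```

So one round of page views from those 30 users means roughly 187.5 MB read out of blobs, before any caching is considered.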
 

KM-Tiger

If you're on Linux, just symlink the image from the real dir to the other website's dir.

(Wait: I haven't even tried that myself, I'm not sure that would work, but I don't see why not)

OT: Yes it will, but two things could go wrong:

The setting of FollowSymLinks in Apache (if it's Apache)
File permissions, depending on the user running the Apache process and the file's owner.

Big risk of setting up something very hard to maintain.
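For anyone wanting to try it, a minimal sketch of the symlink step plus a check of the permissions point. The helper names are invented, the paths in `demo` are throwaway temp directories, and the check only tells you whether the current user can read the target, so it says nothing about Apache's FollowSymLinks setting:

```python
import os
import tempfile


def link_image(real_path, site_dir):
    """Create a symlink to real_path inside site_dir and return the link path."""
    link_path = os.path.join(site_dir, os.path.basename(real_path))
    if not os.path.islink(link_path):
        os.symlink(real_path, link_path)
    return link_path


def readable_through_link(link_path):
    """True if the link resolves and the current user may read the target."""
    # os.path.exists and os.access both follow the symlink to the real file.
    return os.path.exists(link_path) and os.access(link_path, os.R_OK)


def demo():
    """Link a dummy image from one temp directory into another and test it."""
    real_dir = tempfile.mkdtemp(prefix="real_")
    site_dir = tempfile.mkdtemp(prefix="site_")
    real_path = os.path.join(real_dir, "logo.png")
    with open(real_path, "wb") as fh:
        fh.write(b"not really a png")
    link_path = link_image(real_path, site_dir)
    return os.path.islink(link_path) and readable_through_link(link_path)
```

Run the readability check as the user the web server runs as, otherwise it proves very little.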
 
Your script can look at the HTTP_REFERER (sic) header to make sure your image is not being linked to from another page. But if you are using Apache you can let the web server do this in mod_rewrite.

HTTP_REFERER will help you stop other sites LINKING to your images but it won't help you stop them DISPLAYING them in-line.

As with Apache, IIS 7 has a URL Rewrite module that will allow you to block (or redirect) requests based on the HTTP_REFERER header.

Regards

Dotty
 
Avoid storing images in the database.

Why?

Because in most dynamic database-driven sites, it is the database which is the bottleneck, so don't clog it up unnecessarily.

Forget images for a moment, when it comes to scalability and optimised performance, it is not even a good idea to have data in the database!

That's why there are push CMS architectures and caching architectures, and why, to increase WordPress speed, you switch on caching to avoid hitting the database. Caching in WordPress pre-fetches the data out of the database and places it in, guess what, files.
 

Cohesive Computing

... but can you advise on the processing issues?

A Microsoft report here claims that SQL Server 2005 performs better than NTFS when processing small binary objects, and vice versa with larger objects.

The FILESTREAM blob feature of SQL Server 2008 is claimed to provide a more consistent level of read performance across small and large binary objects, comparable to NTFS.

Whichever DB you use, storing binary objects for high-volume web traffic is not the steep technical challenge that some are making it out to be. A little web analytics information combined with even a simple (pre-)caching strategy will cope with scores of page requests per second.
 
TotallySport

Thanks for the info, I am glad you pointed out the handling of the sites that MrAnchovy seems to have missed.

The bottleneck is my issue. With internet speeds a lot quicker now, cache IMO isn't the same benefit as it used to be when displaying web pages, and is now a very small benefit compared to the dial-up days.

Based on your 30+ users calling 3000*64kb from a database, how would that compare to mapping them? Either way the images will have to be processed and loaded. Possibly the database is the more underused processing part of the system, so if I used it to handle the images, would that free up IIS processing?

I have no idea if the above statement is anywhere near accurate, but if you could clear the wild statement up it would be appreciated, although I would be hoping the system would have more than 30 visitors.
 
TotallySport

many thanks will look into that later
 
The bottleneck is my issue. With internet speeds a lot quicker now, cache IMO isn't the same benefit as it used to be when displaying web pages, and is now a very small benefit compared to the dial-up days.

Based on your 30+ users calling 3000*64kb from a database, how would that compare to mapping them? Either way the images will have to be processed and loaded. Possibly the database is the more underused processing part of the system, so if I used it to handle the images, would that free up IIS processing?

I have no idea if the above statement is anywhere near accurate, but if you could clear the wild statement up it would be appreciated, although I would be hoping the system would have more than 30 visitors.
When you say cache in the same breath as dialup I'm guessing you are just referring to browser cache, which is client-side cache, the cache which is mainly reused by the same user. Don't forget there is the other server-side cache to think about too, the type of cache which can be reused by different users. It is this server-side caching which I'm talking about in my previous post.

Client-side caching relieves the bottlenecks of bandwidth, connection speeds, page load times. Server-side caching relieves the bottlenecks of the database and server processing, server performance and scalability.

For example, a page which requires stuff from the database, rather than get that stuff from the database each time the page is requested, get it once, save it to a file on the server, and then serve up that file for subsequent requests. That is an example of server-side caching.
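A minimal sketch of that get-once, save-to-file, serve-the-file pattern. `fetch_from_database` is a stand-in for whatever query the site really runs, `CACHE_DIR` is a throwaway location, and keys are assumed to be safe to use as file names:

```python
import os
import tempfile

# Hypothetical cache location; a real site would use a fixed directory.
CACHE_DIR = tempfile.mkdtemp(prefix="page_cache_")


def fetch_from_database(key):
    """Stand-in for the real (expensive) database query."""
    fetch_from_database.calls += 1
    return f"content for {key}".encode("utf-8")


fetch_from_database.calls = 0  # counts how often the "database" is hit


def get_cached(key):
    """Return the bytes for key, hitting the database only on a cache miss."""
    # Assumes key is a safe file name (no path separators).
    path = os.path.join(CACHE_DIR, key + ".cache")
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return fh.read()
    data = fetch_from_database(key)
    # Write to a temporary name and rename, so readers never see a partial file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as fh:
        fh.write(data)
    os.replace(tmp_path, path)
    return data
```

Calling `get_cached("home")` twice hits the stand-in database once; the second call is served from the file. There is no expiry here, so a real version needs some way to refresh or delete stale files.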

In a lot of cases, it is the database which is the bottleneck. Why? Because of maximum concurrent database connections, each database connection takes memory, of which there is a finite amount. In a shared environment there's only a certain number of maximum concurrent connections allowed. Which means that if they get used up in a busy period, a page either returns a connection error, or just sits and waits for a connection to free up, possibly timing out if it has to wait too long.
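The connection limit can be modelled in a few lines. This is only an illustration of the behaviour described, not how any particular database or host implements it, and the `ConnectionLimiter` class is invented for the example:

```python
import threading


class ConnectionLimiter:
    """Models a server that allows at most max_connections at once."""

    def __init__(self, max_connections):
        self._slots = threading.BoundedSemaphore(max_connections)

    def acquire(self, timeout):
        """Wait up to timeout seconds for a free slot; False means it timed out."""
        return self._slots.acquire(timeout=timeout)

    def release(self):
        """Hand a slot back so a waiting request can proceed."""
        self._slots.release()
```

With a limit of 2, the third request that arrives while both slots are held gets `False` after its timeout, which is the "sits and waits, then times out" case. Every image pulled from a blob holds one of those slots for the duration of the read.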

Without this type of caching, there will be loads (loads as in workloads to process) on IIS and the database. With this type of caching there will be loads on IIS only. But either way, for images to be served to a browser from a server, IIS will be part of that equation: the image will pass through IIS's handling since the browser is only talking to IIS. Adding a database to the equation won't remove or relieve this (as in reduce the load).

If you really want to have your images in your database you can do this, but for performance and scalability you should server-side cache them into server-side files, so that the database isn't hit each time an image is requested. Server-side caching in this way, having a database but not using it real-time, is a feature of the solution you mentioned:
 
TotallySport

Thanks for clearing that up
 