
Sunday, 23 November 2008


Comments

The words "not field relevant" in photozone.de's lens tests say a huge amount that we miss out on when looking at reviews.

Ctein,

I don't find your article clearly explained. It reminds me of the lack of clarity on the www.dxomark.com pages. I like to see graphs fully explained, with meaningful labels on the axes and a meaningful title.

Tendering all respect due, Ctein, I humbly posit the theory that you think too much...

best wishes


Does no one today just take a camera for test drives and simply take some pictures?

Dear Ken,

Not all my columns, and not all web sites, are intended for novices. If you don't understand the dxomark web pages, you shouldn't be reading them. Seriously. Making constructive use of that information requires a certain minimum amount of technical savvy. If you can't suss out what they're saying, you are better off skipping it.

pax / Ctein

Dear Greg,

Au contraire, mon ami; others think far too little.

--------

Dear Wee,

You've hit upon the vexing part of buying a camera today. With the demise of so many local camera stores, it has become very difficult for most folks to find a physical camera to play with before buying. Even folks like Mike and me often cannot get the manufacturers to loan us the cameras we would like to test. More and more, the old (and still wise) advice of "drive before you buy" isn't a real possibility. Potential buyers are forced to turn to third-party test data to figure out what camera might suit them best.

It is not a satisfactory state of affairs. I don't see any way to fix it, unfortunately.

pax / Ctein

WeeDram asked, "Does no one today just take a camera for test drives and simply take some pictures?"

The tests are still useful in helping set our expectations and removing the surprises that might crop up in situations in which we didn't get a chance to test.

I chose a camera that can be a bit noisy in low light even though I do much of my shooting in exactly those conditions. Thanks to lots of published tests, I made the choice knowing what to expect. The balance of things offered by the camera is the right one - the bright viewfinder, fast access to the controls like metering mode and focus mode, the general image quality and colour response, the price and the way it feels to use.

Published tests are useful (even really, really geeky tests). We're at risk of letting them become the be-all and end-all of the camera-buying process, but that's being addressed very well in this article and others published here.

(I'm slowly losing my fear of digital noise, perhaps just through familiarity. It's only a matter of time before bold, unashamed and artful use of digital noise becomes mainstream, just the way similar use of film grain eventually did. Mr Johnston can then quite rightfully claim to have been singing its praises for years…)

Yes, agreed Greg.

When I browse DPR forums looking for NICE PHOTOS from say the G10, every post tends to begin with "here are a few test shots," "small comparison," and along those lines. See I care very little about people's personal fiddling with their new toy and I wish the 'newness' (if that is a word) would wear off sooner so I could start looking at some real photography coincidentally with a camera I am interested in.

I would have expected some serious scatter over the past five years as that is the period in which some makers (Nikon, Canon, and to some extent, Kodak) were producing cameras of serious sophistication while others were just getting involved in the technology. [See the whole Leica M8 debate.] As the others catch up, I would expect dispersion to narrow, as, in fact, these graphs seem to demonstrate. I would be very interested to see a trend line on the pixel-pitch graph which eliminates all models but the high-end machines from Nikon, Canon and Sony (Minolta). I wouldn't be surprised if that were very narrow indeed -- or perhaps a new line that correlates closely with the maker's intention, high ISO or high-res. If you had two trend lines, one showing high ISO/noise effectiveness, and one showing high-resolution/noise effectiveness, I suspect that would show what we all believe, that is, that bigger pixels (D3) have less noise at higher ISOs, and smaller pixels (1DsIII, A900) have more noise and better resolution, and that everything is getting somewhat better over time. Which is to say that DxO has proven what any high school kid who just signed on to Digital Photography Review could have told you.

JC

"If you can't suss out what they're saying, you are better off skipping it." OUCH!

Anyway, to the extent I understand the dxomark graphs I am amazed how well my little el cheapo, lightweight, high synch speed, 6MP Nikon D40 stands up compared to the grownup cameras.

"Data Mining for Dummies" but not novices ?

Y

Thanks for nailing down some interesting ideas. I appreciate your clear thinking!

I can remember when cameras were for making photographs, not charts and graphs.

Hi Ctein
An interesting article.
My initial thought was that it's a long time since I did any statistics, but I wonder how they did the regression analysis (drew the line) on the camera date vs SNR graph; without the D3 (2007 39) the line gets a lot steeper, I'd say closer to 2 dB - but that's based on me holding a biro to the screen and eyeballing it...

But then they do draw this out later by plotting different sensors sizes in different colours:
http://www.dxomark.com/index.php/eng/Insights/SNR-evolution-over-time/SNR-and-image-quality-evolution
which is more informative, and probably a better way of displaying the data.
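The sensitivity of a fitted line to one camera can be sketched in a few lines of Python. This is only an illustration of an ordinary least-squares fit, which may or may not be what DxO used; every number below is made up for the example and none of it is DxO data, and the helper name `ols_slope` is hypothetical.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical (launch year, SNR in dB) points; NOT real measurements.
trend = [(2003, 30.0), (2004, 32.0), (2005, 34.0),
         (2006, 36.0), (2007, 38.0), (2008, 40.0)]
# One extra hypothetical camera that sits well off the trend.
odd_one_out = (2007, 33.0)

with_all = trend + [odd_one_out]
slope_all = ols_slope([x for x, _ in with_all], [y for _, y in with_all])
slope_without = ols_slope([x for x, _ in trend], [y for _, y in trend])

print(f"slope with every point:      {slope_all:.2f} dB/year")
print(f"slope without the odd point: {slope_without:.2f} dB/year")
```

With so few points, a single camera can move the slope noticeably; which way it moves depends on where that camera sits relative to the others.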

Regarding SNR and how software has helped to reduce noise: where you have some control over the parameters, as with cameras that shoot RAW, you get a choice (one man's noise is another's Pointillism), and that is good.
But it's a shame that not all cameras, even the little point and shoots, offer RAW. For example, there are hacks for Canon's Ixus (Powershot in USA) that allow you to record the RAW (CRW) file. I was amazed at the amount of processing - even at ISO 80 - and how weird the processed image looked compared to the RAW image. They hit the image with what you could mimic in Photoshop by going crazy with Dust and Scratches and then Fade to Darken.

Regards, Tim

ahaa......... I have more fun with this......
http://www.artouko.com/bum/

I have a couple of problems with the DxO data.

The D3 highlights the first of what I think is a problem with the DxO results. DxO reports on RAW output. And the D3 outputs very clean RAW files.

In fact, they are too clean, I think. How can a sensor output a true RAW file that has virtually no chroma noise, but plenty of luma noise? The answer - I don't think it can.

Nikon, I think - with the emphasis on "think" - is processing away its sensor chroma noise before its final RAW output. It isn't possible to have a CMOS chip that naturally outputs no chroma noise, is it?

So, I am not sure there is a level playing field for the DxO data - pre-RAW output processing will give much higher marks. Yet the D3 may not deliver as good an image at high ISOs as a camera whose output is processed in an outboard program.

My second problem with the DxO data is that I don't think - and I really could be off-base here - they have specified how their weightings and algorithm are calculated. Too much black box feeling for me, at least for now.

Thanks for another provocative article, Ctein. :)

Gingerbaker said, "My second problem with the DxO data is that I don't think - and I really could be off-base here - they have specified how their weightings and algorithm are calculated. Too much black box feeling for me, at least for now."

I have to agree, particularly regarding the 'sensor number'. I can't find a formula for this.

Fascinating. Utterly fascinating.

It's like reading Road and Track, where test drivers earnestly compare various automobiles. It's like reading Cycle magazine, where they flog various motorcycles around a track and time the laps.

Friends and I, being high-tech engineers by trade, created a similar situation a few years back. We performed a ton of camera/lens tests for 120 and 4x5/8x10 inch formats. Learned a lot, actually. I measured and calculated and considered and published... and... and... even with all that concrete measurable information and knowledge, my photographs failed to improve.

It was only after I put aside the pursuit of the unimportant to concentrate on learning _how_ to make images that galleries began showing my work, awards were awarded, and publications started publishing my new work.

The biggest lesson I learned is this: if one's goal is to create wonderful images, testing and measuring simply does not help.

For some people, measuring and testing becomes a convenient distraction.

Dear Yanchik,

"DMfD" = "TiC"

("TiC" = "tongue-in-cheek")

Data mining is never for dummies; dummies who go data mining come back with fool's gold every time.

---------------------

Dear John,

Interesting ideas, but unfortunately there are few enough data points that parsing them the way you suggest won't allow for an analysis. You could well turn out to be right, but for now it remains in the category of a bar bet rather than solid purchase information.

I'm not convinced that the technology has stabilized. In fact, that's going to be the topic of my already-written Solstice column. You have anticipated me!

Some characteristics have stabilized: in the early part of the decade you could find cameras of comparable pixel counts that had a factor of two or more difference in the number of spatially-resolved pixels in the image they produced. Nowadays performance is pretty consistent; if you assume that your real spatial resolution is about 40% of the pixel count, you're going to be close. Other qualities haven't converged, and I don't know when they will.

More bar bets I think.
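As a quick worked example of the 40% rule of thumb above, here is a minimal Python sketch. The function name and the sample pixel counts are hypothetical, and the 0.40 factor is only the rough estimate quoted in this comment, not a measured constant.

```python
def effective_megapixels(sensor_megapixels, resolved_fraction=0.40):
    """Rough estimate of spatially resolved megapixels.

    resolved_fraction defaults to the ~40% rule of thumb; it is an
    approximation, not a property of any particular camera.
    """
    return sensor_megapixels * resolved_fraction


# Hypothetical pixel counts, chosen only to show the arithmetic.
for megapixels in (6.0, 10.0, 12.0):
    estimate = effective_megapixels(megapixels)
    print(f"{megapixels:.0f} MP sensor -> roughly {estimate:.1f} MP resolved")
```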


~ pax \ Ctein
[ Please excuse any word-salad. MacSpeech in training! ]
======================================
-- Ctein's Online Gallery http://ctein.com 
-- Digital Restorations http://photo-repair.com 

Dear Tim,

I also have intuitive doubts about the robustness of some of the trendlines, but that would be a topic for another column, and not important to the topic I was writing about, so I didn't much go into it. Statistics of small numbers is such a dicey area...


~ pax \ Ctein

Dear Ginger,

If you dig through the technical papers on their website, you'll find mentioned the important fact that what is being measured are not RAW files but RGB files created from RAW. As they note, in some characteristics improvements in RAW conversion algorithms over the past decade have swamped improvements in the sensors themselves.

Also, RAW data is not the photoelectric signal collected by the sensor; it is highly massaged output data. You never get to see what the sensor collects, and you wouldn't want to. There needs to be lots of internal signal processing going on.

If you go back and reread my review of the Fuji S100fs, you'll see some speculation in there by me and DDB about the differences in internal signal processing between our Fuji and Nikon cameras.

Not that this really matters; you get what you get. You're buying a computer system, as it were, not individual components. Remember that these tests are supposed to be useful, not merely technical. Measuring a pre-processor sensor characteristic that the photographer can never, ever get access to is only interesting to designers.

All benchmarks are very black box; you just probably never noticed it before. Quick, tell me exactly how Kodak computes graininess these days, or Pop Photo computes SQF.

Or for that matter exactly how Modern Photography ran their lens tests back in the '80s, which were a heck of a lot more transparent! Which makes a point of importance: I would always get resolution figures substantially higher than MP's--like 30 to 50% higher! Way way beyond what should be normal experimental differences. Peter Moore and I talked about this; we never could figure out what we were doing differently. We also didn't think it really mattered; the important thing about a benchmark is that it produces consistent results across time and camera models, not that it be directly comparable to anyone else's benchmark.

So even if you knew what was under the hood, it wouldn't do you any good. The real thing you have to decide is whether the results are useful to you. If they are, it doesn't matter. If they're not…well, it also doesn't matter what's under the hood!

~ pax \ Ctein

Gingerbaker said: "It isn't possible to have a CMOS chip that naturally outputs no chroma noise, is it?"

Actually, that's the very nature of it.

With CCD sensors, noise reduction is performed by a separate processor, and most cameras provide raw images after the CCD readout, but before the noise reduction processing.

With CMOS sensors, noise reduction circuitry is built into the sensor itself, so the output from the sensor is post-noise reduction, and that's what the raw image stores, with no way of skipping that step.

Steven S

I am on thin ice here, but I think you may be mistaking electronic noise subtraction at the sensor level in CMOS chips for actual noise reduction done downstream.

I have never seen photographs from a CMOS sensor (or, for that matter a CCD sensor) which did not display BOTH chroma and luma noise. That is, until the D3 came out, and lo and behold, most if not all of the chroma noise is gone and only luma noise remained.

Nikon does a great job of this, however. The processing is super fast, and they have done a very good job of maintaining detail and color accuracy.

But, it is also, inescapably (I think), post processing and not an inherent characteristic of the sensor itself.

Until the D3, I think it is a fair statement to say that post processing done on board the camera has always been inferior to the result one can produce by outboard programs and a big CPU.

I have read some posts, but have NO first-hand knowledge, that this trend continues, despite the fine results from the D3. That is, one may get better final results from a different approach.

And so I question the DxO results, which claim to offer us information on the physical qualities of the sensors themselves.

As Ctein points out, there really is no such thing as a pure sensor output; we can only read the output from what actually is a small computer system in each camera.

I am still uncomfortable about the objectivity of the DxO results for these reasons, and, combined with the proprietary nature of the evaluations, find no satisfaction from their availability. Especially since some users have pointed out seemingly nonsensical results in the data.

I also think that the DxO data is almost impossible, at least for me, to use as a basis for actually making a buying decision.

I am thinking about getting a small camera to carry with me, instead of my forty pound bag of DSLR's and lenses and kitchen sink. I have been following the on-line discussions and viewing samples of photographs from Canon G10 and Panasonic LX3 users. Michael Reichmann even had a serious post on the relative capabilities of the G10 vs a medium format digital back, and found nearly identical results under certain conditions when prints of a certain size were viewed. Amazing progress in small cams and the output from their sensors!

But go to DxO and compare the results of a G10 to any DSLR. Would anyone in their right mind come away from the DxO graphic comparison of a G10 and a DSLR with Michael Reichmann's conclusion? I submit the answer would be a resounding "no". And hence, my reluctance to view the DxO results as useful to me.
