One of our writers and one of our readers got involved in a technical dispute recently that brings up an interesting issue. Let's say you have two imaging devices. Each one has eight settings. Here are the quality scores for each device at each setting, higher being better:
Device 1:
2 3 5 7 9 7 5 3
Device 2:
3 4 6 7 7 7 6 5
Your task is to assign one rank or grade for "image quality" to each device, expressed as an integer from 1 to 10. How would you grade each device?
Mike
Device 2: I'd prefer an overall "good" quality.
Posted by: Michael | Sunday, 08 February 2009 at 10:10 AM
Good question: the second one for consistency, but there's not much in it.
Of course, if I were looking for ultimate image quality then 1 gets it, but for general use 2 does.
Posted by: Andrew | Sunday, 08 February 2009 at 10:12 AM
Depends on whether people expect to use most of the settings or will use a small range most of the time. If it's the type of device where people are most likely to use it at the 9 spot +/- one position (i.e. 797 in the first and 777 in the other), then, the first one might win. If they're likely to use it throughout, the higher total/more consistent (second one) likely wins.
That's assuming the scale is 0-10 and not, perhaps, -10 to 10, with 0 being the point at which something becomes "acceptable." Are things only 5 or higher acceptable? If so, again, the latter might win.
I don't think it's a valid question (as asked). You're basically asking a weaker version of "highly specialized" versus "somewhat specialized," which is itself a bit of a weaker version of "specialized" versus "jack of all trades, master of none."
Posted by: iacas | Sunday, 08 February 2009 at 10:13 AM
TLI: we don't know what the settings are or what the numbers mean.
Is a one-point difference meaningful? Does it fall within the margin of error? Is the scale logarithmic? Exponential?
What are the settings? If setting 5 (or settings 4-6) are used most often by most people, then maybe Device 1 looks better most of the time.
Anyway, like I said, too little information.
Posted by: MBS | Sunday, 08 February 2009 at 10:15 AM
Easy decision:
Device 2 for overall performance.
Device 1 if you need setting 5 a lot.
Posted by: Aeneas | Sunday, 08 February 2009 at 10:17 AM
Device 1 has a total of 41. Device 2's total is 45. That tells me that Device 2, overall, produced better images, regardless of the 9 rating Device 1 received at a given setting.
In my mind the device that has better numbers overall is the better device and should get the higher rating.
Posted by: Tony In Knoxville | Sunday, 08 February 2009 at 10:18 AM
Device 1 has a mean score of 5.125, and a median score of 5. Device 2 has a mean score of 5.625 and a median score of 6. Device 1 outperforms Device 2 on only one test and ties on two tests. Device 2 outperforms Device 1 on 5 tests. Unless the test on which Device 1 outperforms Device 2 is a critical function (for instance if it is the setting that will be used the majority of the time), then Device 2 is clearly a better performer overall.
A clearer (though more personal) assessment could be made by weighting each of the 8 settings by order of importance. Then the score for each device would be multiplied by the weight and the scores could be averaged. This would give a clear picture of which device is the best performer in the most important areas.
Posted by: Tony McDaniel | Sunday, 08 February 2009 at 10:21 AM
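Tony McDaniel's weighting idea is easy to try out. A minimal Python sketch (the scores are from the post; the `mid_heavy` weights are a made-up example, not anything from the thread):

```python
# Scores from the post, higher is better.
scores_1 = [2, 3, 5, 7, 9, 7, 5, 3]
scores_2 = [3, 4, 6, 7, 7, 7, 6, 5]

def weighted_score(scores, weights):
    """Weighted mean: multiply each setting's score by its weight, then average."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

equal = [1] * 8                       # every setting matters equally
mid_heavy = [1, 1, 2, 4, 6, 4, 2, 1]  # hypothetical: middle settings favored

print(weighted_score(scores_1, equal))                # 5.125
print(weighted_score(scores_2, equal))                # 5.625
print(round(weighted_score(scores_1, mid_heavy), 2))  # 6.57
print(round(weighted_score(scores_2, mid_heavy), 2))  # 6.38
```

With equal weights Device 2 wins; weight the middle settings heavily enough and Device 1's lone 9 pulls it ahead, which is exactly the point about weighting being a personal assessment.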
Well, Device 2 rates equal/higher on all but one test so if each setting is equally weighted then it rates at a +4 (+6 for the higher scores and -2 for the one lower score).
Posted by: Michael Z | Sunday, 08 February 2009 at 10:25 AM
It depends on your use, I guess. I would choose device 1 if I knew I was going to be using setting 5 much of the time and settings 4-6 most of the time. On the other hand if I thought my usage would be spread across the available settings, device 2 might be preferable.
For instance, if these settings were apertures, with settings 4-6 being f/5.6-f/11, then as a landscape shooter I would definitely choose device 1, since those are the apertures I use.
Posted by: Jeff | Sunday, 08 February 2009 at 10:28 AM
Depends on the distribution of usage of each setting.
Overall image quality = sum, for k=1 to 8, of "percentage of use of setting k" × "quality of setting k"
Depending on which settings you use most frequently, you would prefer device 1 or device 2.
If I assume that the setting is actually lens aperture, I'd choose device 2 if it were a normal or telephoto lens, and device 1 if it were a wide-angle lens.
Posted by: Antoine | Sunday, 08 February 2009 at 10:30 AM
I would have to look seriously at device #2. Though #1 does have a single point of higher "IQ" at level 5, it increases swiftly from level 3 to level 5 and then drops equally fast by level 7. Device #2 may have slightly less peak "IQ," but it has a higher level of consistent "IQ" from level 3 through level 8: it rises progressively, holds its peak longer, and falls off smoothly all the way out to level 8 with little change.
Peak "IQ" may be better for device #1 if used ONLY at level 5 but for (my) day to day use I'd happily live with the slightly less "IQ" but more consistently reliable results from Device #2. With the limited info provided in the problem as posed Device #2 would rank higher overall for me.
Posted by: R.C. Poole | Sunday, 08 February 2009 at 10:35 AM
There is no "correct" abstract answer. This is a mathematical (over)simplification of the standard tradeoff that we all decide upon based on the realm in which we photograph. If Device 1 can typically be used at the settings that gain the highest quality, 9 (e.g., sharpness or dynamic range or ?), then that's the choice. But if we only get "9" quality 10 percent of the time, yet 80% of the time get better overall quality with Device 2, then that is the better tool. That's why reviews which accurately describe use (landscape, portrait, low-light, sports, back-country) are more useful for calibrating to each individual's unique situation. All-purpose tools do not exist.
Posted by: Warren Frederick | Sunday, 08 February 2009 at 10:36 AM
To further clarify the question, let's pretend you can't further clarify the question. Act like it's a printed exam question: you don't get to argue the premises or ask for more information, you don't get to crib your neighbor's answer, and the only way you can flunk is by failing to give a clear answer. (I do get annoyed having to explain what "hypothetical question" means every time I ask a hypothetical question....)
Mike J.
Posted by: Mike Johnston | Sunday, 08 February 2009 at 10:36 AM
Device 2 for a "working" photographer who relies on consistency in the performance of his "tools."
Device 1 for the "creative" (ahem, artistic) photographer who can live with mechanical limitations to reach for the situation where his "tools" allow for the "magic." I.e., Yamaha piano vs. Steinway: the Yamaha is consistent, but the Steinway "can" offer the edge that allows magic.
dale
Posted by: dale moreau | Sunday, 08 February 2009 at 10:37 AM
Which one will my lens fit?
Posted by: Clayton Lofgren | Sunday, 08 February 2009 at 10:38 AM
"Device 2" as it is more consistent in quality.
Posted by: Rich Read | Sunday, 08 February 2009 at 10:42 AM
This is a statistical problem. The average quality is:
Device 1: 5.1 +- 2.4
Device 2: 5.6 +- 1.5
the second value is the standard deviation (the smaller it is, the closer all the values are to the average), therefore:
Device 2 has a higher average quality and deviates less from the average across the different settings.
Of course, if other criteria set in (e.g., I will only use some particular settings), then the conclusion could be different.
Posted by: P Mendes | Sunday, 08 February 2009 at 10:43 AM
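The two summary figures here, and the slightly different pair quoted later in the thread, come down to which standard-deviation convention is used: dividing by n-1 or by n. A quick check (editor's sketch, not part of the original thread):

```python
import statistics

scores_1 = [2, 3, 5, 7, 9, 7, 5, 3]
scores_2 = [3, 4, 6, 7, 7, 7, 6, 5]

for name, s in [("Device 1", scores_1), ("Device 2", scores_2)]:
    mean = statistics.mean(s)
    sample_sd = statistics.stdev(s)   # divides by n - 1: 2.42 and 1.51
    pop_sd = statistics.pstdev(s)     # divides by n: 2.26 and 1.41
    print(f"{name}: mean {mean:.3f}, sd {sample_sd:.2f} (n-1) / {pop_sd:.2f} (n)")
```

Either convention supports the same conclusion: Device 2 has the higher mean and the smaller spread.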
Device 2 for me.
Better throughout is more interesting for real-life work than one optimum and several lesser scores across the board. The single '9' would not mean anything to me, as I cannot be held to shooting at that '9' setting.
I take pictures, though not hypothetically.
Posted by: Donald Giannatti | Sunday, 08 February 2009 at 10:45 AM
I'm going to assume each rating is for a lens at a particular f-stop here.
I'd personally go for Device 2 as it's more consistent across the board. Being able to just stick with the one setting rated at 9 would be great, but in real-world practice, I believe there would be many situations where I was forced to use the other settings. I'd prefer not to see a major variation in quality when the conditions force me to work toward the first and last settings.
Posted by: K. Praslowicz | Sunday, 08 February 2009 at 10:46 AM
assuming equal weights for all of the scores, I would go with Device 2; however, unequal weightings on the scores could easily change that.
Posted by: Colin Thomas | Sunday, 08 February 2009 at 10:48 AM
I'd have to go with Device 2 if this is all the information given. Except for one setting, it scores consistently the same or better than Device 1.
The one _crucial_ piece of information is what the settings are, i.e., which ones will be used the most. If the high scorer for Device 1 is, oh, “Automatic” for example, then Device 1 should win overall, as many (most?) people leave it set there as a default (or set it there during events that you'd like to participate in as well as document).
But based on the _overall_ data (without context), Device 2 wins as it is a bit better most of the time. (I'm also assuming that 10 would be a perfect score.)
Posted by: Ethan | Sunday, 08 February 2009 at 10:52 AM
With apologies to the early commenters, I just changed the question. Too few people were "getting" it....
Mike J.
Posted by: Mike Johnston | Sunday, 08 February 2009 at 10:52 AM
I would say device 2 because it appears to have the broadest capability but it really is subjective depending on your needs and goals.
Posted by: Brian | Sunday, 08 February 2009 at 10:52 AM
I'd rate device 2 higher in overall IQ. It meets or exceeds device 1 at 7 of the 8 settings and I suspect the difference between a 9 and a 7 would not be visible in all but the rarest of *my* prints or screen images.
Posted by: Pat Cooney | Sunday, 08 February 2009 at 10:55 AM
Hey, I'm a weirdo! Device 1 for me. Max of one compared to max of the other...
Posted by: Grega | Sunday, 08 February 2009 at 11:00 AM
Device 2: more consistent, and higher, across most of the image settings. However, a potential customer interested in using setting #5 most of the time might well discount Device 2's higher score on the other 7 settings, and choose to purchase Device 1.
Posted by: Joe | Sunday, 08 February 2009 at 11:01 AM
I gotta get my morning coffee before I can tackle this...
Posted by: Charles Mason | Sunday, 08 February 2009 at 11:02 AM
Device 2, let's face it. Device one sucks, it just got lucky once.
Posted by: Prasanth | Sunday, 08 February 2009 at 11:02 AM
Are they both weatherproofed? If so, I'd go with device 2, unless it was built by Canon.
JC
Posted by: John Camp | Sunday, 08 February 2009 at 11:06 AM
I would choose the canon one!
Posted by: Ramon Acosta | Sunday, 08 February 2009 at 11:09 AM
You can't.
Posted by: Bryan Hansel | Sunday, 08 February 2009 at 11:09 AM
Who knows?
In my early days in digital photography, when I was first deciding to move up from a $700 2MP Kodak P&S to a $1,200 Minolta Dimage A1 "prosumer," I assiduously compared numbers like these on DPReview to the point of obsessive distraction. Eventually I decided on the Minolta, plonked down my money (mostly AmEx points, truth be told) and started taking pictures - at which point the quibbles over how much weight to give to different numeric values of different features became moot and even irritating. I stopped looking at DPReview daily and started looking at sites like Luminous Landscape and TOP, when it came along, to learn how to take pictures instead.
I went through the same cycle when I moved up to the Konica Minolta 7D, although in that case unmeasured things, like my ownership of a Minolta flash, weighed in.
Again, when a new generation of high-ISO low-noise 12MP CMOS DSLRs came out I worried over how to weight and compare such strings of numeric values. Eventually I got the Nikon D300. Whenever I have regrets about that purchase it's never because of any of the numeric ratings on IQ - it's always about missing the intuitive ease and handling of my beloved Maxxum 7D. And most of that is personal to me, my 35 year history with Minolta gear, and will never be captured in any set of deceptively scientistic numeric scores.
There is no answer to this question except possibly for a few, rare individuals for whom the factors reported in the question are the only relevant ones and they know their correct personal weightings for each of them. For everyone else (and that's probably everyone) the best that can be said for the numbers is that they may be included in a larger, personal, more humanly subjective assessment.
Posted by: Adam Isler | Sunday, 08 February 2009 at 11:13 AM
P Mendes said:
"Device 1: 5.1 +- 2.4
Device 2: 5.6 +- 1.5
the second value is the standard deviation"
Unless Mike changed the numbers, the Standard Deviations for the two devices are actually 2.2 and 1.4.
Either way, assuming this is something like the ratings at different f stops, device 1 is capable of the best performance, even if only at one setting, so device 1 is the best.
Posted by: Peter Robinson | Sunday, 08 February 2009 at 11:14 AM
Assuming all criteria are equally weighted, and since you didn't specify (on purpose?) what the maximum score for each criterion was in this ranking (was it 10 or 100 or...):
So:
- you could give Device 2 a +4 rating (difference in scores)
- you could give Device 1 a -4 rating
- you could say that Device 2 has 5 times better image quality than Device 1, because Device 2 wins 5 of the 6 contested categories (the 2 tied categories cancel out) and Device 1 wins just 1 of the 6.
- you could say that Device 2 is 67% better than Device 1 because (5/6 - 1/6) = 4/6 ≈ 67%
That's enuff cypherin for me...my haid hurts!
Internet reviews certainly have their flaws, but if you can find enough of them, trends do sometimes develop.
It is also a big help if you know what you want the item to do so you can look for strong ratings in that specific criterion.
It still beats buying a pig in a poke (especially if you are buying on-line from a different country)...so long as you don't get eaten-up by analysis paralysis.
Cheers! Jay
Posted by: Jay Frew | Sunday, 08 February 2009 at 11:14 AM
Well... just what I needed to clear the Sunday morning fuzz out of my brain!
My intuitive answer is to rate Device 1 higher than Device 2. Mine has to be an intuitive answer because I have no training in statistics or higher mathematics to help me justify my choice. So why Device 1 then?
Knowing that I could achieve an image quality score of 9, even only at one "setting" is significant because achieving the best possible result is paramount to me. And 9 is 2 full steps up from the best possible Device 2 rating of 7. Had it only been 8, the choice would not be so clear to me. Of course this might mean using a tripod or giving up some other operating convenience since that "setting" may require a slower shutter speed etc.
Some folks might rate Device 2 as higher since there are more settings that give a higher rated result. The implication is that Device 2 is more flexible across a range of operating settings, which may be imposed by the prevailing conditions at the time of use.
Assuming these "devices" might be cameras, how about this scenario? Device 1 would be preferred by landscape photographers who can take their time and use the single setting that gives them a 9 rating, and Device 2 is for sports and action photographers who may need a wider range of settings, none achieving a 9 rating but still yielding good results under a wider variety of circumstances with less time to manipulate settings.
It seems to boil down to the old idiom, "horses for courses".
Posted by: Reg Quiring | Sunday, 08 February 2009 at 11:17 AM
device one, I would give a 9
device two, I would give a 7
I would note what settings and what possible compromises would be needed to achieve the maximum image quality from each device.
Posted by: Edward Taylor | Sunday, 08 February 2009 at 11:24 AM
And how was the score computed?
Just let me test-drive both devices and I’ll be happy to tell everyone which one works better for me and why.
Until then numbers are meaningless!
(DxO Mark numbers for medium format camera sensors would not impress anyone but pictures sure would)
Posted by: Alex Seagull | Sunday, 08 February 2009 at 11:25 AM
Device 1 is capable of the highest image quality. Device 2 is appealing as a compromise as it is more consistent across the range, but does not offer the ability to produce the same level of "quality" as number 1.
Ranking them is merely a subjective decision. While you can average to your heart's content, the test shows that Device 1 is capable of the best quality and should therefore get the higher score.
So if you're Mike J., you'll say Device 1 is the more capable camera, but recommend Device 2 at the end of your comparison in an attempt to confuse and befuddle your readers. *big grin*
Posted by: Adirondack Pete | Sunday, 08 February 2009 at 11:32 AM
Interesting Mike,
If you were reflecting the actual dispute and the numbers concerned, then the numbers would be more like:
9 7 5 4 4 3 and
9 8 7 6 6 5
Since best quality was the same, it would be a no-brainer.
Posted by: Rob | Sunday, 08 February 2009 at 11:33 AM
To answer the question as intended:
Device one gets a score of 5
Device two a score of 6
How do I get that? First I took the straight arithmetic mean, then the mean excluding the high and low scores (this second step is a simple way to adjust for variance, like they do in sports judging). I compared the two sets of numbers (5.12/5.00 & 5.62/5.83 respectively).
Of course all the other commenters have pointed out the trouble with giving a single score for a 2-dimensional problem.
Posted by: Martin Doonan | Sunday, 08 February 2009 at 11:41 AM
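Martin Doonan's skating-judge adjustment (drop the single high and low score, average the rest) takes only a few lines of Python (editor's sketch):

```python
scores_1 = [2, 3, 5, 7, 9, 7, 5, 3]
scores_2 = [3, 4, 6, 7, 7, 7, 6, 5]

def trimmed_mean(scores):
    """Mean after discarding the single highest and single lowest score."""
    middle = sorted(scores)[1:-1]
    return sum(middle) / len(middle)

print(trimmed_mean(scores_1))            # 5.0
print(round(trimmed_mean(scores_2), 2))  # 5.83
```

Trimming removes Device 1's lone 9, so the adjustment slightly widens Device 2's lead over the plain means (5.125 vs 5.625).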
This question is way too vague and can't be answered.
Posted by: Anonymous | Sunday, 08 February 2009 at 11:50 AM
I would buy device 2 if their prices were close to each other. I would score them the same, perhaps.
I value "consistency" more than "stellar performance" limited to a narrow range.
Posted by: Bülent Celasun | Sunday, 08 February 2009 at 11:51 AM
Have each camera take a picture with identical settings under identical lighting. Do not post process (or apply identical post processing methods). Print each picture with the same printer. Evaluate the print. Which print looks more appealing? There's your "better" camera. It's about the print, not about the camera.
Posted by: Jason Anderson | Sunday, 08 February 2009 at 11:53 AM
Answer to rephrased question:
Based on my previous post device 1=5 and device 2=6. Unless subjective values are considered (aka different weights for the settings).
Posted by: P Mendes | Sunday, 08 February 2009 at 11:55 AM
Which one goes to eleven?
Posted by: Bill Rogers | Sunday, 08 February 2009 at 12:01 PM
This isn't an answer, but...
if you had two devices (call them cameras):
device one gives you
12.5% quality 2
25% quality 3
25% quality 5
25% quality 7
12.5% quality 9
device 2 gives you
12.5% quality 3
12.5% quality 4
12.5% quality 5
25% quality 6
37.5% quality 7
put another way
device one
12.5% quality 9 or better
37.5% quality 7 or better
62.5% quality 5 or better
87.5% quality 3 or better
100% quality 2 or better
device two
37.5% quality 7 or better
62.5% quality 6 or better
75% quality 5 or better
87.5% quality 4 or better
100% quality 3 or better
so if you define good enough quality as
quality 9 you want device one
quality 7 it's a tie (37.5% each)
quality 6 you want device two
quality 5 you want device two
quality 4 you want device two
quality 3 you want device two
quality 2 both devices always deliver
I habitually go with device one which makes people wonder why a camera that costs ten times as much only gets a third as many "good" photos.
Posted by: hugh crawford | Sunday, 08 February 2009 at 12:01 PM
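Hugh Crawford's "quality X or better" table is a cumulative distribution over the eight settings, and it can be regenerated for any threshold (editor's sketch, not part of the original comment):

```python
scores_1 = [2, 3, 5, 7, 9, 7, 5, 3]
scores_2 = [3, 4, 6, 7, 7, 7, 6, 5]

def frac_at_least(scores, threshold):
    """Fraction of the eight settings scoring at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

for t in range(9, 1, -1):
    f1 = frac_at_least(scores_1, t)
    f2 = frac_at_least(scores_2, t)
    pick = "device 1" if f1 > f2 else ("device 2" if f2 > f1 else "tie")
    print(f"quality {t} or better: {f1:.1%} vs {f2:.1%} -> {pick}")
```

Run it and the crossover is clear: device 1 wins only when you insist on quality 8 or 9, thresholds 3 through 6 all favor device 2, and thresholds 2 and 7 are exact ties.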
Just to be different, I would say that the scores so far are generally too high. The quality is so inconsistent across the various settings, I'll go ahead and award them both a two, and send them back to the drawing board, maybe with fewer settings this time :)
Posted by: Michael Barkowski | Sunday, 08 February 2009 at 12:02 PM
The device's total score should be the max of its scores in any sub-field.
I'll take device #1, and make sure I use setting 5 as often as is applicable.
Posted by: PhysicsMan | Sunday, 08 February 2009 at 12:05 PM
Simple. Add up the numbers, divide by Pi and adjust for inflation.... Then get back to taking photographs.
Posted by: Tom | Sunday, 08 February 2009 at 12:06 PM
Too much speculation and too many assumptions.
Device #1 gets an 8 out of 10
Device #2 gets... 8 out of 10
Why? Because an equipment review score is subjective and only has true meaning to the reviewer. For me (the reviewer in this case) the peak quality of #1 and the consistency of #2 cancel each other out. Buy the one that's cheaper and be done with it (the previous post on "How to choose a digital P&S" comes to mind). I applaud DXO for trying to be empirical about cameras, but the recent listing of medium format sensors has shown that just because you can produce a number for comparison, it's not an accurate measure of the item's quality.
Posted by: JasonP | Sunday, 08 February 2009 at 12:11 PM
If you assign each setting an equal weight, then the solution is easy. Device 2 gets 45, Device 1 gets 41.
If you don't then it could be anything, and I'm pulling out the not enough information to make a meaningful decision card.
Posted by: cp | Sunday, 08 February 2009 at 12:20 PM
Device 1: 2. Device 2: 6.
Posted by: R.W. Bloomer | Sunday, 08 February 2009 at 12:33 PM
My statistics are a bit rusty, but I think the objective or mathematical way to go would be to calculate the geometric mean, i.e., in this case the 8th root of the product of the eight rankings. If I recall, this method is called for because the scale being used to do the ranking (1 - 10) is 'ordinal', arbitrary and integer based. A conventional mean / median computation would only be appropriate if the scoring scheme were in real numbers, i.e., numbers with decimals/fractions, or at least real quantities of things as opposed to a mere ranking.
Posted by: ed nixon | Sunday, 08 February 2009 at 12:34 PM
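For what it's worth, Ed Nixon's geometric mean is quick to compute (editor's sketch; `math.prod` needs Python 3.8+):

```python
import math

scores_1 = [2, 3, 5, 7, 9, 7, 5, 3]
scores_2 = [3, 4, 6, 7, 7, 7, 6, 5]

def geometric_mean(scores):
    """The n-th root of the product of the n scores."""
    return math.prod(scores) ** (1 / len(scores))

print(round(geometric_mean(scores_1), 2))  # 4.59
print(round(geometric_mean(scores_2), 2))  # 5.42
```

The geometric mean punishes low outliers harder than the arithmetic mean does, so Device 1's 2 hurts it more here and Device 2's lead widens.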
Hold on! Hold on!
We are asked a question and a bunch of us jump in with both feet, some with more verbosity than others.
Then a few minutes later we are told to “Act like it’s a printed exam question”.
Then a few minutes after that we are informed that “I just changed the question”.
Something is fishy here. I have never heard of anyone changing the question after the exam has started.
Looks like this exam is rigged and should be reported to the examination board.
Who would that be? Oh right this is Mike’s blog therefore he is the examination board ... never mind.
Carry on.
Posted by: Rich Read | Sunday, 08 February 2009 at 12:44 PM
Amazed anyone thinks they can answer based on the information given.
Say these are different lighting conditions you might encounter with your pocket camera - then you want 2, presumably, to cover more situations.
Say these are different positions of the focus ring: the pinhole "lens" is always in focus, but not very sharp, while your 50/1.4 really needs to be focused. But you'd prefer the lens, no?
Also amazed this thread is on its second page of comments already, although I'm guilty here too...
Posted by: improbable | Sunday, 08 February 2009 at 12:45 PM
I'm making an assumption here: that these settings are on a linear scale, such as exposure settings. The way the numbers are clustered around a curve just implies that to me. If that's the case, I would select the second one, because I always find myself struggling with the extreme ends of a limited range such as this, rather than comfortably staying in the middle.
Posted by: Dayv | Sunday, 08 February 2009 at 12:47 PM
Interesting question that highlights the limitations of a single score. My answer for general use would be a score of 5 for Device 1, 6 for Device 2. My assumption is that they're scored on a 10 point scale with 5 being the avg score. I just averaged the deviations around 5 to come up with the score. So my score is based upon deviation around the mean.
Posted by: Eric | Sunday, 08 February 2009 at 12:48 PM
I wouldn't give a damn either way. Concrete factors such as ergonomics, interface design, how the viewfinder and LCD look, how the shutter sounds, how big and heavy the body is, and what the images look like are a lot more important to me than a bunch of numbers. PLEASE NOTE: This is just my personal point of view. I'm not saying anyone else should feel the same way.
Posted by: Gordon Lewis | Sunday, 08 February 2009 at 12:51 PM
I'm very tired of all these "Image Quality" debates on the net. "Image Quality" has very little to do with photography. The important thing is what does it communicate to the viewer.
The photographs in Robert Frank's "The Americans" don't have great "Image Quality" but they will be remembered long after all the photos with "Image Quality" are forgotten.
Posted by: John A. Stovall | Sunday, 08 February 2009 at 12:58 PM
No.1 is good for landscape or still life.
No.2 is good for action/reportage.
If I know the user's needs I can give a grade
Otherwise assigning only a number as a grade for image quality is not useful.
Posted by: Jozef | Sunday, 08 February 2009 at 01:06 PM
both fail my standard. so, fail.
Posted by: jm44 | Sunday, 08 February 2009 at 01:13 PM
Device 1 = 5
Device 2 = 5.8
I figure the judges at figure skating must have a point, so delete lowest and highest score for each device and calculate average for the rest...
Best, Nick
Posted by: Nick | Sunday, 08 February 2009 at 01:16 PM
Looking at those scores I'm going to firstly carry on taking pictures with my current Device 0 whilst waiting eagerly to see what Device 3 brings to the party when it's released.
Posted by: Duncan | Sunday, 08 February 2009 at 01:17 PM
Doesn't it all depend on which settings are you going to use most ?
Averages etc. count for very little if you are always going to use settings 4, 5 and 6.
As with many things "best" does not really esist. There may be a "best for a purpose" and, more likely, a "best for a purpose and user's preferences".
That's why the overall score that some rewievers like to give (and over which there are endless debates in the forums) are fairly meaningless (to me, at least).
Posted by: Me | Sunday, 08 February 2009 at 01:20 PM
Grega: "Hey, I'm a weirdo! Device 1 for me. Max of one compared to max of the other..."
P Mendes: "This is a statistical problem. The average quality is ..."
Since the task is to come up with a single numeric score, it *is* a statistical problem. That doesn't automatically mean a straight average, though. If we had enough information, the usual way to proceed would be to compute a *weighted* mean, not a simple average. That way, each setting would be given a weight proportional to its importance to the ranker. A person who really likes setting 5 would give it a high weight, and thus camera 1 would tend to rank higher than camera 2.
Here, though, we have no other information. So we have no basis for assigning different weights to different settings. So a simple average is really all we can do. Camera 2 comes out higher.
If for some reason setting 5 were, say, 4 times as important as any other setting (which were all equal), then camera 1 would come out on top. When I ran the calculation for various weights, the two cameras tied exactly at a relative weight of 3 for setting 5, meaning even a moderate preference for setting 5 would cause camera 1 to rank higher than camera 2.
There could be an exception for a person for whom the highest image quality is of very high importance, hang the exact details of the settings. Then instead of a weighted average, one would apply a MAX() function to all the settings, a la Grega.
The upshot of all this is that, since we have almost nothing to go on, we basically have to use equal weights. People like Grega go for using the MAX() function instead.
Posted by: Tom Passin | Sunday, 08 February 2009 at 01:31 PM
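Tom Passin's sensitivity check is easy to reproduce. In this sketch (editor's addition) the only free parameter is the hypothetical weight given to setting 5; every other setting gets weight 1:

```python
scores_1 = [2, 3, 5, 7, 9, 7, 5, 3]
scores_2 = [3, 4, 6, 7, 7, 7, 6, 5]

def weighted_mean(scores, w5):
    """Equal weights everywhere except setting 5 (index 4), which gets weight w5."""
    weights = [1, 1, 1, 1, w5, 1, 1, 1]
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

for w5 in (1, 2, 3, 4):
    print(f"weight {w5}: {weighted_mean(scores_1, w5):.3f} "
          f"vs {weighted_mean(scores_2, w5):.3f}")
```

On these numbers the two weighted means tie exactly at a weight of 3, and camera 1 pulls ahead for anything beyond that.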
If this was a buying decision (the original question), I probably wouldn't buy either. Device 2 isn't outstanding at anything, and device 1 is limited in its excellence -- a good time to wait for the next model.
If I needed a number for a wide audience, I would cite the medians, not the means, and, God forbid, never the standard deviations.
So device 2 wins in this comparison.
But then, I say it is (merely) Recommended, and everyone knows that both are being damned with faint praise.
scott
Posted by: scott kirkpatrick | Sunday, 08 February 2009 at 01:51 PM
It's a simple math problem: you just have to define a distance.
The simplest distance is the plain sum, d = x1 + ... + x8. As stated by Tony (in Knoxville), with such a distance Device 1 scores 41 and Device 2 scores 45.
With a Euclidean distance, d = sqrt(x1² + ... + x8²), Device 1 scores 15.8 and Device 2 scores 16.4. A bit closer, mmmm?
Now take a cubic distance, d = cbrt(x1³ + ... + x8³): Device 1 scores 11.99 and wins over Device 2 (which scores 11.88).
Given the marketing hype around device1 and device2 and their respective housing colors, which way to score do you prefer?
Numbers only tell you what you want them to tell you. ;o)
Posted by: Nicolas | Sunday, 08 February 2009 at 02:12 PM
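Nicolas's three distances are all p-norms, and the crossover he notices between the squared and cubed versions is easy to see in a loop (editor's sketch):

```python
scores_1 = [2, 3, 5, 7, 9, 7, 5, 3]
scores_2 = [3, 4, 6, 7, 7, 7, 6, 5]

def p_norm(scores, p):
    """p-th root of the sum of p-th powers; p=1 is the plain sum."""
    return sum(s ** p for s in scores) ** (1 / p)

for p in (1, 2, 3, 4):
    print(f"p={p}: {p_norm(scores_1, p):.2f} vs {p_norm(scores_2, p):.2f}")
```

Raising p weights the large scores more, so a high enough p always ends up rewarding Device 1's single 9. Picking the metric picks the winner, which is exactly the commenter's point.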
I think Jim Phelps would not choose to accept this mission. As Mike intended, it's nonsense. I would ignore any commentator who thought that a single integer was sufficient to measure image quality, and I'm sure you guys would all do the same. Unfortunately, most people don't do this, which is why we get megapixel and high-ISO wars :-(
Posted by: Chris | Sunday, 08 February 2009 at 02:45 PM
I'd probably give them similar performance scores. Interested individuals can assess the particulars for themselves. Two devices can deserve similar ratings for different reasons, at least on paper.
Value, on the other hand, depends on circumstances. Like application. Or what if device #2 costs four times more than device #1? But is half the size and weight? But available only in shocking pink? But there is a five month waiting list for device #1? etc.
[These remind me of lens ratings at different apertures, but I'm sure many devices perform this way.]
Posted by: robert e | Sunday, 08 February 2009 at 02:50 PM
It's very hard to answer with the limited information given, but just looking at the numbers, it looks like two lenses are being compared at apertures ranging from f/2 to f/22... (I missed the original discussion, but this seems like a reasonable assumption, given that the numbers peak in the middle)
If that's the case, lens 1 is excellent at f/8, but lens 2 is equal or better at all other apertures. You'd have to know the application in order to tell which is the more attractive option. If you shoot at f/5.6 to f/11 all the time, it's option 1. If you like to shoot wide open, it's option 2.
If it's not lenses, then I'm barking up the wrong tree!
Posted by: Stuart | Sunday, 08 February 2009 at 02:52 PM
Device 1: 7
Device 2: 7
Posted by: Peter Cameron | Sunday, 08 February 2009 at 02:55 PM
Drop the scores at the lowest and highest settings from each, as any precision device normally should be used in the middle of its range. Sum up the rest and divide by six. You end up with 6.0 and 6.2. In integer format, that's a 6 for each, a tie. How's that for cooking the books!
Posted by: Omar | Sunday, 08 February 2009 at 03:06 PM
I would go with the medians and score Device 1 as 5 and Device 2 as 6. If these were lenses that would make good sense.
Posted by: nikosR | Sunday, 08 February 2009 at 03:08 PM
I wouldn't, unless you were holding an innocent person hostage and would only release them if I did. Any other answer simply contributes to the entirely unethical attempt to turn every aspect of the world into something to which a metric can be attached. It's a sociopathic manoeuvre, designed for the benefit of people who like to do sums rather than make judgement calls. In the UK, we've been living with governments that do this for the past twenty years. Sorry to reframe, but this kind of evil nonsense needs to be challenged wherever it is found.
The terrifying thing is, I have to deal regularly with people who make public policy on this kind of basis.
Posted by: hughlook | Sunday, 08 February 2009 at 03:24 PM
Well, I took the root mean square of the series to determine an integer rating.
So, device 1 is 5.6, which rounds to 6, and device 2 is 5.8, which also rounds to 6. As with most answers, you can't simply look at the final result and get any really meaningful information unless you know more about the questions used to get to that answer.
Posted by: Nick | Sunday, 08 February 2009 at 03:26 PM
If you really, really are forced to boil this down to a single number from 1 to 10, I suggest using the root-mean-square, normalized to the given scale. That gives:
Device 1: 5.60
Device 2: 5.80
Or as an integer, 6 for both.
But single numbers are only one way to make an "at a glance" representation. For this particular application, a spider chart seems very appropriate. Like this:
http://chart.apis.google.com/chart?cht=r&chdl=Device+1|Device+2&chco=00FF0080,0000ff80&chm=h,CCCCCC,0,1.0,4.0|B,00FF0080,0,1.0,5.0|B,0000FF80,1,1.0,5.0&chds=1,9,1,9&chd=t:2,3,5,7,9,7,5,3,2|3,4,6,7,7,7,6,5,3&chs=460x460
Posted by: Matthew Miller | Sunday, 08 February 2009 at 03:41 PM
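For the curious, the root-mean-square figures quoted in the two comments above are easy to reproduce; a minimal Python sketch, using the scores from the question:

```python
from math import sqrt

def rms(scores):
    """Root-mean-square of a list of quality scores."""
    return sqrt(sum(s * s for s in scores) / len(scores))

device1 = [2, 3, 5, 7, 9, 7, 5, 3]
device2 = [3, 4, 6, 7, 7, 7, 6, 5]

print(round(rms(device1), 2))  # 5.6
print(round(rms(device2), 2))  # 5.8
```

Both round to 6 as an integer, matching the figures given.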
Device one score: Apple
Device two score: Orange
Tricky one this. It almost sent me Bananas…
Posted by: Adrian Malloch | Sunday, 08 February 2009 at 04:00 PM
Device #1 is my Kodak SLR/n, and I'll keep shooting with it until I can afford a D3X, or until a D700X is released.
Posted by: mudhouse | Sunday, 08 February 2009 at 04:01 PM
I used the old Formula 1 scoring system, which dropped the best and worst performance of each driver throughout the season and then tallied the remainder. With that I rate device 1 with a score of 5 and device 2 with a score of 6 (rounded-up average).
Posted by: steven palmer | Sunday, 08 February 2009 at 04:07 PM
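The Formula-1-style drop-the-best-and-worst method described above can be sketched in a few lines of Python (scores taken from the question):

```python
def trimmed_mean(scores):
    """Drop the single best and worst scores, then average the rest."""
    rest = sorted(scores)[1:-1]
    return sum(rest) / len(rest)

device1 = [2, 3, 5, 7, 9, 7, 5, 3]
device2 = [3, 4, 6, 7, 7, 7, 6, 5]

print(round(trimmed_mean(device1)))  # 5
print(round(trimmed_mean(device2)))  # 6
```

Dropping the 2 and 9 from device 1 leaves an average of exactly 5.0; dropping the 3 and one 7 from device 2 leaves 35/6 ≈ 5.83, which rounds up to 6.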
Nigel: The numbers all go to eleven. Look...right across the board.
[...]
Marty: Why don't you just make ten louder and make ten be the top number and make that a little louder?
[pause]
Nigel: These go to eleven.
Posted by: Mark | Sunday, 08 February 2009 at 04:22 PM
The question doesn't have all the information we'd need. Most important, we have no idea if this is the kind of setting that you decide on once (like jpeg quality setting, "normal/vibrant/natural" images and so on); or a setting that you change all the time (like ISO or aperture).
In the first case, the first is better; in the second case I would probably prefer 2 - but that depends on the precise setting in question and the use I am putting the device to.
Posted by: Janne | Sunday, 08 February 2009 at 04:35 PM
Serves You Right Mike!
;~))
Cheers! Jay
Posted by: Jay Frew | Sunday, 08 February 2009 at 05:03 PM
I would average the top three scores.
Device 1 gets a score of 8.
Device 2 gets a score of 7.
Most people caring about IQ will be using the highest IQ settings and 3 seems like a good number.
Posted by: Sam | Sunday, 08 February 2009 at 05:09 PM
I'd grade them each a 5.-- Richard
Posted by: Richard | Sunday, 08 February 2009 at 05:31 PM
I wouldn't bother and simply say they are both probably good enough.
Posted by: Pete | Sunday, 08 February 2009 at 05:52 PM
I would compare those to two students doing the same test of 8 short exercises in a specific branch of studies...
As in many cases, the head of the school doesn't allow any marks that are not fully rounded.
Student 1 gets 5.125, which gets rounded to 5
Student 2 gets 5.625, which gets rounded to 6
If you are the teacher of that specific course, you would tend to overmark the student who knows something in each test (the nerdy type) over the one who fell down on at least one exercise he knew something about (the jock type)...
A sadistic teacher might even be angered by Student 1 and give him a 4, leaving Student 2 at 6 to encourage him further vs the rest of the class (limited here to two individuals)!
It's a funnier way to rate the deviation factor, maybe...
Posted by: Margouillat | Sunday, 08 February 2009 at 05:58 PM
I'll avoid your new question too, :-) That's why different kinds of average are used:
Mean/Median/Mode/Range
Device 1: 5.125/5/3, 5 & 7/7
Device 2: 5.625/6/7/4
So across the board Device 2 is generally best and more consistent. But I still won't choose a number because the actual meaning of the data-set determines which average is most appropriate. That's the only way to turn data into information. Garbage in/garbage out...
Posted by: Warren Frederick | Sunday, 08 February 2009 at 06:01 PM
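The four averages in the comment above can be reproduced with Python's standard statistics module (multimode requires Python 3.8+; scores from the question):

```python
import statistics

device1 = [2, 3, 5, 7, 9, 7, 5, 3]
device2 = [3, 4, 6, 7, 7, 7, 6, 5]

for name, scores in [("Device 1", device1), ("Device 2", device2)]:
    print(name,
          statistics.mean(scores),       # arithmetic mean
          statistics.median(scores),     # middle value
          statistics.multimode(scores),  # most frequent score(s)
          max(scores) - min(scores))     # range
```

Note that device 1 is trimodal: 3, 5, and 7 each appear twice, while device 2 has a single mode of 7.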
They each get the same integer between 1 and 10, say 5.
Device 1 wins for top performance, device 2 wins for consistent performance.
bd
Posted by: bobdales | Sunday, 08 February 2009 at 06:22 PM
To answer the question using statistical analysis, one must give each setting the same weight. In other words, with eight settings, setting one must be used for 12.5% of the exposures, setting two must be used for 12.5% of the exposures, and so forth for all eight settings.
In the real world, this just doesn't happen. For example, assume that the device is a fast prime lens and that the eight settings relate to aperture. It stands to reason that fast prime lenses are more often used at low f-stops (otherwise, why did you buy it?), but in this question, we don't even know whether setting 1 corresponds to an open aperture or a closed one.
What do I win?
Posted by: Bill Rogers | Sunday, 08 February 2009 at 06:36 PM
Pick the black one. Everyone knows that truly professional imaging devices are black.
Posted by: Bill Rogers | Sunday, 08 February 2009 at 06:38 PM
This is easy. I'm not a fan of "overall" scores. Be it dpreview's "highly recommended" or photodo's weighted numeric rating based on mtf charts, it's useless or harmful. Of course, many people are too lazy to figure out what to look at in a review, so they eat up such ratings, and they get what they deserve.
OK, but hypothetically, like say my job depended on it ... I'd come up with a set of percentages adding up to 100: how much of the time people (in the target market for the type of camera we're rating) use setting 1, setting 2, and so on. Then I'd weight the results accordingly. Ideally those percentages would be based on some kind of research.
Posted by: Dennis | Sunday, 08 February 2009 at 08:08 PM
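The usage-weighted approach described above is easy to sketch. The weights below are entirely hypothetical, invented for illustration only; real ones would come from the kind of research the commenter mentions:

```python
# Hypothetical usage weights (illustrative only): the fraction of the time
# each of the eight settings might be used, summing to 1.0.
weights = [0.05, 0.10, 0.20, 0.25, 0.20, 0.10, 0.05, 0.05]

device1 = [2, 3, 5, 7, 9, 7, 5, 3]
device2 = [3, 4, 6, 7, 7, 7, 6, 5]

def weighted_score(scores, weights):
    """Usage-weighted average quality score."""
    return sum(w * s for w, s in zip(weights, scores))

print(round(weighted_score(device1, weights), 2))  # 6.05
print(round(weighted_score(device2, weights), 2))  # 6.15
```

With these made-up weights the two devices land very close together; a different usage profile (say, one concentrated on the settings where device 1 scores 9) would flip the ordering.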
I meant to follow up because I'd described what my answer would be without giving it. (Though that should be fine in hypothetical cases.)
But it struck me that the data as presented in the question is easy to understand and helpful, while a synthetic score would only obfuscate. So why do it? How about a graph instead? "Humpy" vs "gentle" curves?
Don't get me wrong--I have no problem with testing or numbers per se. I believe some, though not all, factors involved in image quality can be tested and measured. I believe it is ultimately all on the artist/artisan, but I also believe that it's better to have the right tools for the job, and that data can sometimes help one find the right tool. So I'm for more data; one can always ignore it. But how data is presented makes all the difference. In this case, a synthetic "score" trades off a great deal of information for little gain in succinctness. There's no point, other than as a game.
[For what it's worth, I had scored them 5.7 and 5.6. I arbitrarily used the mean, plus a fudge factor. Can't throw out high/low scores or high/low settings because in real life those can matter and I can't pre-judge application; plus, it's fair to assume that settings reflect design parameters. The consistency of device 2 is inherently reflected by the arbitrarily chosen methodology (a.c.m.), whereas the "9" is a special achievement worth an extra half point fudge to compensate for an inherent weakness in the a.c.m.
Obviously, no conclusions possible about which is the "better" device for user X on mission Y.]
Posted by: robert e | Sunday, 08 February 2009 at 09:18 PM
Generally, the settings are moot. The only thing that matters to me:
The IQ of RAW files developed by 3rd party software.
Why? In order to get beyond the image mangling tricks by the manufacturers, which they do either in camera or in their proprietary RAW software. (noise reduction, distortion correction, highlight tone priority, "picture styles", sharpening, etc.)
Posted by: Uncle Sam | Sunday, 08 February 2009 at 09:25 PM
At work I am involved in the design of computer chips. As we architect these devices we are constantly faced with the question of whether "option A" or "option B" is the better one. We can state many of the differences between these options numerically (clock frequency, data widths, cache sizes, etc.) However, deciding which architecture is best is not as easy as picking the one with the highest metric (e.g. the fastest clock speed). The only way we can really answer the question of which is better is to simulate how these components would fare when employed by typical users. The concept of a "realistic workload", running on a piece of software that simulates either "A" or "B" is key here.
Similarly, your hypothetical question cannot be answered unequivocally without knowing the relative importance of the metrics to potential users. And yes, this means each person may answer the question differently. Isn't this the reason why people buy and enjoy different cameras or lenses? For example, the surveillance user may only care about center resolution while the architectural photographer may care more about distortion.
Cameras (or lenses) are more than the sum of their parts. They can be used for many purposes and represent different tradeoffs. This cannot be summarized in a single number or ranking. Rather, a good reviewer will know enough about both the camera itself and its potential uses to make (a number of) observations that will (hopefully) be helpful to many of his readers. In other words, the usefulness of a review to me depends on how well the reviewer was able to anticipate my needs.
Posted by: Olivier Maquelin | Sunday, 08 February 2009 at 11:14 PM
In synthetic benchmarks, you use the geometric mean, not the arithmetic mean, to generate a summary value. For the reason why, see spec.org.
That said, if the numbers reflect a subjective evaluation rather than an objective measurement, they are pretty meaningless. There is a method called the analytic hierarchy process that allows you to rank the options, but you normally need several iterations to figure out what the weights should be.
Posted by: Fazal Majid | Monday, 09 February 2009 at 12:46 AM
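For comparison, here is what the geometric mean gives for the two devices (a sketch using Python's statistics.geometric_mean, available from Python 3.8; scores from the question):

```python
from statistics import geometric_mean  # Python 3.8+

device1 = [2, 3, 5, 7, 9, 7, 5, 3]
device2 = [3, 4, 6, 7, 7, 7, 6, 5]

# The geometric mean penalizes weak outliers more than the arithmetic mean,
# which is why benchmark suites use it for summary figures.
print(round(geometric_mean(device1), 2))  # 4.59
print(round(geometric_mean(device2), 2))  # 5.42
```

Note how device 1's low scores of 2 and 3 drag its geometric mean well below its arithmetic mean of 5.125, widening the gap between the two devices.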
Neither. I already have device No. 3.
Posted by: Steve Smith | Monday, 09 February 2009 at 02:18 AM
I'd contemplate my pre-existing bias and then select whichever device best met it. It would probably be my current one, or the one I have promoted publicly.
Posted by: Paul Kierstead | Monday, 09 February 2009 at 10:01 AM
If you NEED quantification, then the standard deviation response above is correct, but like other people stated, it depends on what your typical setting usage dictates. Do I NEED a maximum aperture of f/2.8 if the lens is slightly soft at that setting and I typically shoot landscapes at f/8 where another lens is tack sharp?
Posted by: Jeff | Monday, 09 February 2009 at 11:46 AM