By Anonymous
on Monday, March 11, 2002 - 03:04 pm:
I have a question. How do you
apply the adjectival rating system? Every time I attempt to use
this system, the technical folks want to convert the ratings
into numerical points (e.g. Excellent = 5, Good=4, Acceptable=3,
etc.) When one uses the adjectival system do you just eyeball
how many of each rating an offeror has racked up to rate their
proposal? (e.g., Offeror A has 3 excellents, 1 good, 1 acceptable
= excellent overall). I have read source selection guides in my
office and on this website but it is all very vague when it
comes to demonstrating how the system is actually applied.
Thank you in advance for any feedback.
By
Vern Edwards on Monday, March 11, 2002 - 04:09 pm:
The problem that your technical
people are having arises from the fact that it is difficult to
combine the adjectival scores assigned to individual technical
evaluation factors into a single, aggregated technical score
that reflects overall nonprice value. There is no standard way
to combine one exceptional with one marginal with
one good with one fair into a single, overall
technical assessment.
Adjectival scores are usually categorical assessments, rather
than true measurements. When that is the case it can be
misleading to convert the adjectives to numbers, because two
offerors can be good, but not necessarily equally good.
Assigning the same number to both offerors and then combining
the numbers might lead to faulty conclusions about the offerors'
relative merits.
The problem isn't subjectivity; most evaluation findings are
subjective, whether expressed as numbers or adjectives. The
problem is to have a rational and consistent method for aggregating
individual factor scores. Numbers allow the evaluators to use
the rules of arithmetic when aggregating scores, thereby
ensuring combinatorial rationality and consistency. But there is
no one right way to aggregate adjectival scores.
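As a minimal illustration of that point, here is a sketch, with hypothetical factors, weights, and scores, of the kind of arithmetic aggregation that numbers make possible and that adjectives do not:

```python
# Hypothetical weights and 1-5 factor scores; not from the discussion above.
weights = {"management": 0.40, "technical_approach": 0.35, "key_personnel": 0.25}

offeror_scores = {
    "Offeror A": {"management": 3, "technical_approach": 5, "key_personnel": 5},
    "Offeror B": {"management": 5, "technical_approach": 3, "key_personnel": 5},
}

for offeror, scores in offeror_scores.items():
    # Weighted sum: a single, repeatable rule for combining factor scores.
    total = sum(weights[factor] * score for factor, score in scores.items())
    print(f"{offeror}: weighted total = {total:.2f}")

# There is no comparable arithmetic for combining "Excellent," "Good," and
# "Acceptable," which is why evaluators are tempted to convert them to numbers.
```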
I don't have any specific advice other than to discuss these
things openly during the development of the source selection
plan and reach a consensus about how you will determine
aggregate nonprice value. Other people here are going to tell
you to determine each offeror's value by looking at the
evaluation "narratives." That won't help your technical people,
because they are looking for a way to simplify their evaluation
data in order to make it more manageable. That's what scores
do--they simplify "narrative" data in order to make it more
manageable.
There are a number of books that you might find helpful. See
Decision Analysis for Management Judgment, 2d ed., by
Goodwin and Wright (1998), which is available from John Wiley &
Sons.
By
joel hoffman on Monday, March 11, 2002 - 04:38 pm:
In my opinion, from the
description Anonymous provided, his technical evaluators are not
just looking for a way to simplify the evaluation data to make
it more manageable - they are looking for a shortcut to avoid
technical-cost trade-off evaluations. That has been my
consistent experience in leading or participating in about 80
source selections. I'm sure that many of you folks here have
probably participated in many more, but that's enough to learn a
lot of lessons.
People want a quick and easy way to make a selection. Total up
the points, compare the overall colors... That's one of the
easiest ways to invite and to lose a protest.
Overall scores, color or adjectival ratings are simply
indicators.
Regardless of overall scores or colors or adjectival ratings,
one must actually compare the strengths and weaknesses of the
various proposals and determine the relative advantages and
disadvantages of one proposal in relation to another, and
compare these with the price differences.
As a member of the Design-Build Institute of America, I know
that the Industry also likes the idea of simply adding up points
and selecting the highest scored proposal or dividing total cost
by total points. Never mind what's behind the points. It's human
nature and they think it is less subjective, therefore good!
I know many people who also think that simply dividing total
dollars by total points provides a clear understanding of the
"best value." We gave that system up 8 or 9 years ago. It's only
one "indicator" of value, at best and not even a very good
indicator.
happy sails! joel
By
Anonymous
on Monday, March 11, 2002 - 05:13 pm:
If one cannot combine the ratings
into one rating to describe the offeror's technical proposal,
how does one use the ratings to determine the best value? I have
been trained to have the panel evaluate the technical proposal.
The RFP usually lists a number of Technical subfactors that will
be evaluated by the tech panel. For example, for a service, the
tech panel would evaluate management and technical approach, key
personnel, etc. Evaluator A would evaluate the contractor's
proposal and would ascribe a rating for each factor, e.g.
Management = Acceptable; Technical Approach = Excellent; Key
Personnel = Excellent. Evaluator B would do the same but may
reach a different conclusion. For example, B may rate management
Excellent; Technical Approach=Acceptable and Key
Personnel=Excellent. How does one use this information to
determine what offer represents the best value? Is it wrong to
eyeball the evaluations and give a contractor an overall rating
of acceptable or excellent and then look at the cost
proposals? What evaluation method do you recommend? Does your
evaluation methodology change for different procurements? If it
does, what factors determine your methodology? Thank you.
By
Vern Edwards on Monday, March 11, 2002 - 05:57 pm:
Anonymous:
You don't use ratings or scores (I'm using the two words as
synonyms) to determine best value. At most, you use ratings or
scores to develop a preliminary ranking of the offerors
from best to worst.
You determine best value by comparing the offerors to each other
based on the detailed nonprice factor evaluation findings (the
"narratives") and the prices in order to determine how and to
what degree the offerors are different. If one offeror is better
on the nonprice factors but has the higher price, then you must
decide whether the specific differences in the nonprice
factors are worth the difference in price. If the
specific differences in the nonprice factors are worth the
difference in price, then the higher-priced offeror is the
better value. Otherwise, the lower-priced offeror is the better
value. You continue to make paired comparisons in this way until
you have decided which offeror is the best value.
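To make the paired-comparison process concrete, here is a minimal sketch; the offerors, prices, and ranking are hypothetical, and the worth judgment itself appears only as a placeholder, since it comes from the evaluation narratives rather than a formula:

```python
# Hypothetical preliminary ranking (best nonprice evaluation first) with prices.
offerors = [
    {"name": "Offeror 1", "price": 12_000_000},  # strongest nonprice evaluation
    {"name": "Offeror 2", "price": 10_500_000},
    {"name": "Offeror 3", "price": 10_000_000},  # weakest nonprice evaluation
]

def advantages_worth_premium(better, cheaper):
    """Placeholder for the human judgment described above: are the specific
    nonprice advantages of `better` worth paying the price difference over
    `cheaper`? In practice this is decided from the evaluation narratives,
    not computed."""
    return False  # hypothetical answer, for illustration only

best = offerors[0]  # start with the strongest nonprice proposal
for challenger in offerors[1:]:
    # By the ranking, `challenger` is weaker than `best` on the nonprice factors.
    if challenger["price"] >= best["price"]:
        continue  # weaker and no cheaper: `best` survives this pair
    if not advantages_worth_premium(best, challenger):
        best = challenger  # the premium is not justified; lower price wins the pair

print("Apparent best value:", best["name"])
```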
Strictly speaking, rating or scoring is not necessary in source
selection. It is just a way to summarize more voluminous
information, making it easier to digest and think about. Ratings
or scores work the same way as symbols in mathematics--they
simplify information and facilitate thinking. E = mc^2 is just a
concise way of saying something much more complicated.
However, logically, if a source selection team has developed a
well-constructed numerical scoring system and used it properly,
and if the source selection authority is thoroughly familiar
with that system, then the source selection authority could
confidently determine best value by comparing scores alone.
There are many books that explain how to do that. But the
development and use of such systems seems to be beyond the
ability of most government source selection teams (it requires
special training), so I advise against it.
By
joel hoffman on Monday, March 11, 2002 - 06:39 pm:
In addition to what Vern so well
stated, I advise against individual assignment of ratings or
scores. The Army requires a "consensus" rating. To establish
such a rating, the team generally discusses and agrees on the
pluses and minuses (strengths, weaknesses, deficiencies, etc.)
for a factor and writes them down in a narrative. Then, the team
assigns a rating consistent with the consensus narrative
comments - using a pre-defined rating system with some type of
standard for acceptable and higher or lower.
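As a rough sketch of how such consensus findings might be recorded (the factor, findings, narrative, and rating below are hypothetical):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FactorConsensus:
    """One factor's consensus record: narrative first, rating last."""
    factor: str
    strengths: List[str] = field(default_factory=list)
    weaknesses: List[str] = field(default_factory=list)
    deficiencies: List[str] = field(default_factory=list)
    narrative: str = ""   # the agreed write-up the rating must be consistent with
    rating: str = ""      # assigned by the team against a pre-defined standard

management = FactorConsensus(
    factor="Management Approach",
    strengths=["Staffing plan clearly keyed to the performance work statement"],
    weaknesses=["Transition schedule appears optimistic"],
    narrative="The approach meets the standard and offers one significant strength ...",
)
management.rating = "Acceptable"  # team judgment, checked against the rating definitions
```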
After first evaluating each individual proposal against the
rating criteria, the team THEN compares the proposals against
each other. It's actually easier than it sounds or looks,
provided you have carefully structured the evaluation criteria
and proposal submission format/requirements. happy sails! joel
By
Vern Edwards on Monday, March 11, 2002 - 06:49 pm:
I agree with Joel about
individual assignment of ratings or scores, but I'd go one step
further: Why have more than one evaluator? There may be valid
reasons for having more than one--there are many technical
issues and no single person has all the requisite knowledge;
several offices are providing funds for the contract and each
wants to be represented on the source selection team; there are
a large number of proposals and you want to divvy up the
work--but you shouldn't use more than one if you don't have to.
We're all so used to thinking about teams these days (traveling
in herds) that we sometimes don't stop to ask whether it really
takes more than one person to do the job.
By
Anon2U on Monday, March 11, 2002 - 10:25 pm:
I agree that one person could be
the evaluator (and would be so much easier), however that leaves
the one person vulnerable to charges that he was biased in his
eval. If three personnel have three fairly close evals and all
agree on one final consensus, this eliminates most charges of
bias, fraud or favoritism.
If the three evals are far apart, maybe the RFP is flawed or
else the government doesn't really know what it wants.
By
joel hoffman on Tuesday, March 12, 2002 - 07:47 am:
Anon2U, surprisingly, I've never
been on a board where the evaluators were far apart. One reason
for using several evaluators is for synergy - three or four sets
of eyes catch more info than one and can work well together. Of
course, another is that the client wants to help make the
decision, ESPECIALLY in design-build construction, where a
design proposal is a major part of the selection criteria.
I was a committee of one on some "low price, technically
acceptable" source selections. I had the KO as a final check on
go/no-go criteria. happy sails! joel
By
anon455 on Tuesday, March 12, 2002 - 05:26 pm:
Why have more than one evaluator?
Let's change that one evaluator to the minimum effective
evaluators. Why have more than the minimum effective evaluators?
As a political appointment for every faction with the slightest
interest in the selection. I've even heard it stated in terms of
having such and such division involved for the sole purpose of
tainting one of their favorite people with guilt if things go
wrong. Then the fault game can be played at its most effective
to insulate anyone from real responsibility.
That the practice almost certainly mucks up the
selection is treated as immaterial. Botched selections will be trumpeted
in internal publications as a wonderful job by all the
stakeholders. "No sir, not our fault at all the thing went bad.
Everyone with even a slight stake was part of that decision."
By
joel hoffman on Tuesday, March 12, 2002 - 08:12 pm:
Sorry you feel that way, Anon455.
I haven't found that to be true. happy sails! joel
By
Vern Edwards on Tuesday, March 12, 2002 - 08:33 pm:
Personal experience, one way or
another, is not a valid basis for any kind of generalization
about a larger realm of experience. Agencies use more than one
evaluator for all kinds of reasons, good and bad.
By
formerfed on Wednesday, March 13, 2002 - 08:00 am:
NASA used one of the most
intriguing concepts I've run across in their "Mid-Range"
approach several years ago. Part of it consisted of having a
two-person team do the entire acquisition - a technical person and
the CO. Both would evaluate all aspects of industry's response
and come to consensus. Of course, the CO likely would defer to
the technical person's expertise on detailed issues. Likewise
the CO's business judgment likely would prevail on other
portions. It seemed so efficient and swift, but never caught on.
By
anon455 on Wednesday, March 13, 2002 - 09:53 am:
Note I said "minimum effective"
evaluators and not "one" evaluator. When an evaluation team is
increased beyond a minimum effective number for reasonable
objectives (technical coverage or work division as examples) the
excess is generally, if not almost always, driven by
politics.
A small overpopulation may be nothing more than miscalculation.
But ten people assigned to a job that five could do effectively
can usually be traced to political drivers. It does happen, and it
usually mucks up the process.
By
joel hoffman on Wednesday, March 13, 2002 - 01:07 pm:
Sorry you've experienced such
overkill. We have been successful at keeping the numbers down on
our teams. We do so many that people get burned out, so the
minimum numbers needed to do the job are what we end up with.
We've been fortunate enough that the client agrees to one voting
member to keep the numbers down - they can have additional
advisors, if necessary. We put two or three in-house folks (PM,
Engineering and Construction, or just Engr. and Const.) on most
construction or design-build trade-off method selection boards.
For service contracts, usually 2-3, plus the installation
client.
I've done the evaluation alone, for low price, tech acceptable
method. happy sails! joel
By
joel hoffman on Thursday, March 14, 2002 - 01:47 pm:
THE PROBLEM WITH EQUATING PRICE
WITH POINTS
THE HYPOTHETICAL "PRICE/QUALITY POINTS" EXAMPLE
Given: Maximum Available Points = 1200
Government Estimate: $10 Million
Funds Available: $10.75 Million
Offeror   Price (Rank)    Quality Points (Rank)   $ per Point
1         $10.00M  (1)    600  (6)                $16,667
2         $10.25M  (2)    615  (5)                $16,667
3         $10.50M  (3)    630  (4)                $16,667
4         $10.75M  (4)    645  (3)                $16,667
5         $11.00M  (5)    670  (2)                $16,418
6         $12.00M  (6)    800  (1)                $15,000
Question:
Which Proposal Offers the "Best Value" and Wins the Contract, If
the Price/Quality Point Ratio Is the Basis of Award?
This example illustrates the fallacy of trying to oversimplify
the cost/technical trade-off analysis. Note that price and
quality must be equal in weight to use this type of analysis. It
is, at best, just one indicator of value. Points are not precise
measures of value. The abuse of point scoring systems through
overreliance on the "number of points," rather than development
of strong narratives describing advantages and disadvantages, was
a major contributor to the Army's decision to ban the use of
point scoring systems in 2001.
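For anyone who wants to check the arithmetic, the dollars-per-point figures in the table can be reproduced with a few lines of code (the offeror data are taken from the example above):

```python
# Offeror, price, and quality points from the example above.
offers = [
    ("Offeror 1", 10_000_000, 600),
    ("Offeror 2", 10_250_000, 615),
    ("Offeror 3", 10_500_000, 630),
    ("Offeror 4", 10_750_000, 645),
    ("Offeror 5", 11_000_000, 670),
    ("Offeror 6", 12_000_000, 800),
]

for name, price, quality_points in offers:
    print(f"{name}: ${price / quality_points:,.0f} per quality point")

# Offerors 1 through 4 all land at roughly $16,667 per point despite different
# prices and quality scores, and Offeror 6 "wins" at $15,000 per point even
# though its price exceeds the available funds -- the ratio hides exactly the
# differences a trade-off analysis is supposed to examine.
```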
By
joel hoffman on Thursday, March 14, 2002 - 01:53 pm:
THE PROBLEM WITH SCORING PRICE
Given: Maximum Points Available for Price or for Quality = 1,200
Government Estimate: $10 Million
Funds Available: $10.75 Million
Price Score: Lowest Price = Full Points.
Points for Higher Prices Are Pro-rated by the Ratio of the Lowest
Price to the Offeror's Price.
Offeror   Price (Rank)    Price Points   Quality Points (Rank)   Total Points (Rank)
1         $10.00M  (1)    1,200          600  (6)                1,800  (1)
2         $10.25M  (2)    1,171          615  (5)                1,786  (3)
3         $10.50M  (3)    1,143          630  (4)                1,773  (4)
4         $10.75M  (4)    1,116          645  (3)                1,761  (6)
5         $11.00M  (5)    1,091          670  (2)                1,761  (5)
6         $12.00M  (6)    1,000          800  (1)                1,800  (1)
Question:
Which Proposal Offers the "Best Value" and Wins the Contract,
using using this scoring system as the basis of award?
· This illustrates the fallacy of trying to score price. Who can
say what a "price point" is worth?
· A cost/technical trade-off analysis is still absolutely necessary.
· Points are not precise measures of value.
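Again, the arithmetic behind the table can be reproduced directly (the data are from the example above; points are rounded for display):

```python
MAX_POINTS = 1200  # maximum points available for price (and for quality)

# Offeror, price, and quality points from the example above.
offers = [
    ("Offeror 1", 10_000_000, 600),
    ("Offeror 2", 10_250_000, 615),
    ("Offeror 3", 10_500_000, 630),
    ("Offeror 4", 10_750_000, 645),
    ("Offeror 5", 11_000_000, 670),
    ("Offeror 6", 12_000_000, 800),
]

lowest_price = min(price for _, price, _ in offers)

for name, price, quality_points in offers:
    price_points = MAX_POINTS * lowest_price / price  # lowest price gets full points
    total = price_points + quality_points
    print(f"{name}: price points = {price_points:,.0f}, total = {total:,.0f}")

# Offeror 1 (lowest price, lowest quality) and Offeror 6 (highest price, highest
# quality) tie at 1,800 total points, so the scoring scheme alone cannot choose
# between them -- the trade-off judgment still has to be made.
```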
By
Vern Edwards on Thursday, March 14, 2002 - 03:12 pm:
Joel:
I love ya, Buddy, but your example, aside from the fact that
it's hard to figure out in the format in which you've set it up
(next time, use Bob's instructions to make a table), is
ridiculous. You've set up a ridiculous example and then tried to
use it to establish a general principle--that the assignment of
numerical scores to price is a bad idea. All you've done is
shown us a badly-designed scheme.
If an idiot had done this I wouldn't bother to say much, but
you're not an idiot. What's more, you're an engineer, so you are
mathematically competent. You know about functions, scales, and
scale conversions.
If you don't like the idea of assigning numerical scores to
price, that's okay with me. Lots of people don't. But if you
want to persuade others that it's logically unsound and that it
cannot be done effectively by anyone other than idiots, then
make a better argument. Much better. Only the innumerate would
assent to your generalization based on your example. I showed
your example to my high school senior and he cracked up. His
logic is good enough for him to understand that you used a dead
horse to prove that an elephant doesn't have a trunk.
There are a large number of good college textbooks and other
books that describe schemes known by names such as "simple
additive weighting (SAW)" or "simple multiattribute rating
technique (SMART)" that make an ass out of your argument. Have
you read even one of the books that I've recommended at least
ten times over the past few years here at Wifcon and at the old
Watercooler? Even one, Joel?
Shame on you for this.
Your friend (really),
Vern
P.S. I wouldn't be so upset about this if I didn't think so
highly of you.
By
joel hoffman on Thursday, March 14, 2002 - 04:37 pm:
Vern, unfortunately, many people
who have been scoring price don't read those books, hence the
problem. Oversimplification of the process is exactly the
problem. happy sails! joel
By
joel hoffman on Thursday, March 14, 2002 - 04:39 pm:
(Vern, I read the one you
recommended which was co-written by an Edwards, last year. Is he
your brother? happy sails!)
By
Vern Edwards on Thursday, March 14, 2002 - 07:07 pm:
Joel:
If you read Edwards and Von Winterfeldt, then you should know
better. The problem with systems in which numerical scores are
assigned to both nonprice factors and to price is that
government teams aren't willing to do the hard thinking required
to develop a sound decision analysis system.
You asked, "Who can say what a 'price point' is worth?" Well,
with respect to your example, nobody, because you didn't tell us
(a) what the technical factors are, (b) what scales were used to
measure them, (c) whether the scales used are interval or ratio,
(d) what functions were used to convert the factor measurements
to scores, and (e) the factor importance weights. You rigged it
so that nobody could answer your question, then you argued that
you had proved something to us.
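For readers who have not seen items (a) through (e) spelled out, here is a minimal sketch in the spirit of SAW/SMART; the factors, scales, value functions, and weights are hypothetical and are not offered as a recommended system:

```python
# Hypothetical SAW/SMART-style setup. Each factor has (1) a measurement scale,
# (2) a value function mapping raw measurements onto a common 0-1 scale, and
# (3) an importance weight. Only once these are defined does a "point" of price
# have a stated worth relative to a point of quality.

def price_value(price, best=10_000_000, worst=12_000_000):
    """Linear value function: the best expected price scores 1.0, the worst 0.0."""
    return max(0.0, min(1.0, (worst - price) / (worst - best)))

def quality_value(points, max_points=800):
    """Value function for the raw quality score (treated as a ratio scale)."""
    return points / max_points

weights = {"price": 0.40, "quality": 0.60}  # importance weights, summing to 1.0

offers = [("Offeror 1", 10_000_000, 600), ("Offeror 6", 12_000_000, 800)]

for name, price, quality_points in offers:
    overall = (weights["price"] * price_value(price)
               + weights["quality"] * quality_value(quality_points))
    print(f"{name}: overall value = {overall:.3f}")

# With these (hypothetical) weights and value functions, Offeror 1 scores 0.850
# and Offeror 6 scores 0.600. Change the weights or the value functions and the
# answer can change -- which is why the design work has to be done explicitly.
```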
People respect you, Joel, which is why I think you have a
special responsibility in these things and why I got so mad
about your example. I agree with you that most government teams
should not try to use total numerical scoring systems, but I
agree because I think, as you do, that most of them are not
competent to do so, not because such systems are inherently
unsound.
No, Professor Ward Edwards and I are not related and we have not
met.
Vern |