By Anonymous
on Monday, March 11, 2002 - 03:04 pm:
I have a question. How do you
apply the adjectival rating system? Every time I attempt to use
this system, the technical folks want to convert the ratings
into numerical points (e.g. Excellent = 5, Good=4, Acceptable=3,
etc.) When one uses the adjectival system do you just eyeball
how many of each rating an offeror has racked up to rate their
proposal? (e.g., Offeror A has 3 excellents, 1 good, 1 acceptable
= excellent overall). I have read source selection guides in my
office and on this website but it is all very vague when it
comes to demonstrating how the system is actually applied.
Thank you in advance for any feedback.
By
Vern Edwards on Monday, March 11, 2002 - 04:09 pm:
The problem that your technical
people are having arises from the fact that it is difficult to
combine the adjectival scores assigned to individual technical
evaluation factors into a single, aggregated technical score
that reflects overall nonprice value. There is no standard way
to combine one exceptional with one marginal with
one good with one fair into a single, overall
technical assessment.
Adjectival scores are usually categorical assessments, rather
than true measurements. When that is the case it can be
misleading to convert the adjectives to numbers, because two
offerors can be good, but not necessarily equally good.
Assigning the same number to both offerors and then combining
the numbers might lead to faulty conclusions about the offerors'
relative merits.
The problem isn't subjectivity; most evaluation findings are
subjective, whether expressed as numbers or adjectives. The
problem is to have a rational and consistent method for aggregating
individual factor scores. Numbers allow the evaluators to use
the rules of arithmetic when aggregating scores, thereby
ensuring combinatorial rationality and consistency. But there is
no one right way to aggregate adjectival scores.
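As a minimal illustration of that point, here is a sketch, with hypothetical factors, weights, and scores, of the kind of arithmetic aggregation that numbers make possible and that adjectives do not:

```python
# Hypothetical weights and 1-5 factor scores; not from the discussion above.
weights = {"management": 0.40, "technical_approach": 0.35, "key_personnel": 0.25}

offeror_scores = {
    "Offeror A": {"management": 3, "technical_approach": 5, "key_personnel": 5},
    "Offeror B": {"management": 5, "technical_approach": 3, "key_personnel": 5},
}

for offeror, scores in offeror_scores.items():
    # Weighted sum: a single, repeatable rule for combining factor scores.
    total = sum(weights[factor] * score for factor, score in scores.items())
    print(f"{offeror}: weighted total = {total:.2f}")

# There is no comparable arithmetic for combining "Excellent," "Good," and
# "Acceptable," which is why evaluators are tempted to convert them to numbers.
```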
I don't have any specific advice other than to discuss these
things openly during the development of the source selection
plan and reach a consensus about how you will determine
aggregate nonprice value. Other people here are going to tell
you to determine each offeror's value by looking at the
evaluation "narratives." That won't help your technical people,
because they are looking for a way to simplify their evaluation
data in order to make it more manageable. That's what scores
do--they simplify "narrative" data in order to make it more
manageable.
There are a number of books that you might find helpful. See
Decision Analysis for Management Judgment, 2d ed., by
Goodwin and Wright (1998), which is available from John Wiley &
Sons.
By
joel hoffman on Monday, March 11, 2002 - 04:38 pm:
In my opinion, from the
description Anonymous provided, his technical evaluators are not
just looking for a way to simplify the evaluation data to make
it more manageable - they are looking for a shortcut to avoid
technical-cost trade-off evaluations. That has been my
consistent experience in leading or participating in about 80
source selections. I'm sure that many of you folks here have
probably participated in many more, but that's enough to learn a
lot of lessons.
People want a quick and easy way to make a selection. Total up
the points, compare the overall colors... That's one of the
easiest ways to invite and to lose a protest.
Overall scores, color or adjectival ratings are simply
indicators.
Regardless of overall scores or colors or adjectival ratings,
one must actually compare the strengths and weaknesses of the
various proposals and determine the relative advantages and
disadvantages of one proposal in relation to another, and
compare these with the price differences.
As a member of the Design-Build Institute of America, I know
that the Industry also likes the idea of simply adding up points
and selecting the highest scored proposal or dividing total cost
by total points. Never mind what's behind the points. It's human
nature and they think it is less subjective, therefore good!
I know many people who also think that simply dividing total
dollars by total points provides a clear understanding of the
"best value." We gave that system up 8 or 9 years ago. It's only
one "indicator" of value, at best and not even a very good
indicator.
happy sails! joel
By
Anonymous
on Monday, March 11, 2002 - 05:13 pm:
If one cannot combine the ratings
into one rating to describe the offeror's technical proposal,
how does one use the ratings to determine the best value? I have
been trained to have the panel evaluate the technical proposal.
The RFP usually lists a number of Technical subfactors that will
be evaluated by the tech panel. For example, for a service, the
tech panel would evaluate management and technical approach, key
personnel, etc. Evaluator A would evaluate the contractor's
proposal and would ascribe a rating for each factor, e.g.
Management = Acceptable; Technical Approach = Excellent; Key
Personnel = Excellent. Evaluator B would do the same but may
reach a different conclusion. For example, B may rate management
Excellent; Technical Approach=Acceptable and Key
Personnel=Excellent. How does one use this information to
determine what offer represents the best value? Is it wrong to
eyeball the evaluations and give a contractor an overall rating
of acceptable or excellent and then look at the cost
proposals? What evaluation method do you recommend? Does your
evaluation methodology change for different procurements? If it
does, what factors determine your methodology? Thank you.
By
Vern Edwards on Monday, March 11, 2002 - 05:57 pm:
Anonymous:
You don't use ratings or scores (I'm using the two words as
synonyms) to determine best value. At most, you use ratings or
scores to develop a preliminary ranking of the offerors
from best to worst.
You determine best value by comparing the offerors to each other
based on the detailed nonprice factor evaluation findings (the
"narratives") and the prices in order to determine how and to
what degree the offerors are different. If one offeror is better
on the nonprice factors but has the higher price, then you must
decide whether the specific differences in the nonprice
factors are worth the difference in price. If the
specific differences in the nonprice factors are worth the
difference in price, then the higher-priced offeror is the
better value. Otherwise, the lower-priced offeror is the better
value. You continue to make paired comparisons in this way until
you have decided which offeror is the best value.
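To make the paired-comparison process concrete, here is a minimal sketch; the offerors, prices, and ranking are hypothetical, and the worth judgment itself appears only as a placeholder, since it comes from the evaluation narratives rather than a formula:

```python
# Hypothetical preliminary ranking (best nonprice evaluation first) with prices.
offerors = [
    {"name": "Offeror 1", "price": 12_000_000},  # strongest nonprice evaluation
    {"name": "Offeror 2", "price": 10_500_000},
    {"name": "Offeror 3", "price": 10_000_000},  # weakest nonprice evaluation
]

def advantages_worth_premium(better, cheaper):
    """Placeholder for the human judgment described above: are the specific
    nonprice advantages of `better` worth paying the price difference over
    `cheaper`? In practice this is decided from the evaluation narratives,
    not computed."""
    return False  # hypothetical answer, for illustration only

best = offerors[0]  # start with the strongest nonprice proposal
for challenger in offerors[1:]:
    # By the ranking, `challenger` is weaker than `best` on the nonprice factors.
    if challenger["price"] >= best["price"]:
        continue  # weaker and no cheaper: `best` survives this pair
    if not advantages_worth_premium(best, challenger):
        best = challenger  # the premium is not justified; lower price wins the pair

print("Apparent best value:", best["name"])
```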
Strictly speaking, rating or scoring is not necessary in source
selection. It is just a way to summarize more voluminous
information, making it easier to digest and think about. Ratings
or scores work the same way as symbols in mathematics--they
simplify information and facilitate thinking. E = mc^2 is just a
concise way of saying something much more complicated.
However, logically, if a source selection team has developed a
well-constructed numerical scoring system and used it properly,
and if the source selection authority is thoroughly familiar
with that system, then the source selection authority could
confidently determine best value by comparing scores alone.
There are many books that explain how to do that. But the
development and use of such systems seems to be beyond the
ability of most government source selection teams (it requires
special training), so I advise against it.
By
joel hoffman on Monday, March 11, 2002 - 06:39 pm:
In addition to what Vern so well
stated, I advise against individual assignment of ratings or
scores. The Army requires a "consensus" rating. To establish
such a rating, the team generally discusses and agrees on the
pluses and minuses (strengths, weaknesses, deficiencies, etc.)
for a factor and writes them down in a narrative. Then, the team
assigns a rating consistent with the consensus narrative
comments - using a pre-defined rating system with some type of
standard for acceptable and higher or lower.
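As a rough sketch of how such consensus findings might be recorded (the factor, findings, narrative, and rating below are hypothetical):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FactorConsensus:
    """One factor's consensus record: narrative first, rating last."""
    factor: str
    strengths: List[str] = field(default_factory=list)
    weaknesses: List[str] = field(default_factory=list)
    deficiencies: List[str] = field(default_factory=list)
    narrative: str = ""   # the agreed write-up the rating must be consistent with
    rating: str = ""      # assigned by the team against a pre-defined standard

management = FactorConsensus(
    factor="Management Approach",
    strengths=["Staffing plan clearly keyed to the performance work statement"],
    weaknesses=["Transition schedule appears optimistic"],
    narrative="The approach meets the standard and offers one significant strength ...",
)
management.rating = "Acceptable"  # team judgment, checked against the rating definitions
```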
After first evaluating each individual proposal against the
rating criteria, the team THEN compares the proposals against
each other. It's actually easier than it sounds or looks,
provided you have carefully structured the evaluation criteria
and proposal submission format/requirements. happy sails! joel
By
Vern Edwards on Monday, March 11, 2002 - 06:49 pm:
I agree with Joel about
individual assignment of ratings or scores, but I'd go one step
further: Why have more than one evaluator? There may be valid
reasons for having more than one--there are many technical
issues and no single person has all the requisite knowledge;
several offices are providing funds for the contract and each
wants to be represented on the source selection team; there are
a large number of proposals and you want to divvy up the
work--but you shouldn't use more than one if you don't have to.
We're all so used to thinking about teams these days (traveling
in herds) that we sometimes don't stop to ask whether it really
takes more than one person to do the job.
By
Anon2U on Monday, March 11, 2002 - 10:25 pm:
I agree that one person could be
the evaluator (and would be so much easier), however that leaves
the one person vulnerable to charges that he was biased in his
eval. If three personnel have three fairly close evals and all
agree on one final consensus, this eliminates most charges of
bias, fraud or favoritism.
If the three evals are far apart, maybe the RFP is flawed or
else the government doesn't really know what it wants.
By
joel hoffman on Tuesday, March 12, 2002 - 07:47 am:
Anon2U, surprisingly, I've never
been on a board where the evaluators were far apart. One reason
for using several evaluators is for synergy - three or four sets
of eyes catch more info than one and can work well together. Of
course, another is that the client wants to help make the
decision, ESPECIALLY in design-build construction, where a
design proposal is a major part of the selection criteria.
I was a committee of one on some "low price, technically
acceptable" source selections. I had the KO as a final check on
go/no-go criteria. happy sails! joel
By
anon455 on Tuesday, March 12, 2002 - 05:26 pm:
Why have more than one evaluator?
Let's change that one evaluator to the minimum effective
evaluators. Why have more than the minimum effective evaluators?
As a political appointment for every faction with the slightest
interest in the selection. I've even heard it stated in terms of
having such and such division involved for the sole purpose of
tainting one of their favorite people with guilt if things go
wrong. Then the fault game can be played at its most effective
to insulate anyone from real responsibility.
That the practice almost certainly mucks up the
selection is treated as immaterial. Botched selections will be trumpeted
in internal publications as a wonderful job by all the
stakeholders. "No sir, not our fault at all the thing went bad.
Everyone with even a slight stake was part of that decision."
By
joel hoffman on Tuesday, March 12, 2002 - 08:12 pm:
Sorry you feel that way, Anon455.
I haven't found that to be true. happy sails! joel
By
Vern Edwards on Tuesday, March 12, 2002 - 08:33 pm:
Personal experience, one way or
another, is not a valid basis for any kind of generalization
about a larger realm of experience. Agencies use more than one
evaluator for all kinds of reasons, good and bad.
By
formerfed on Wednesday, March 13, 2002 - 08:00 am:
NASA used one of the most
intriguing concepts I've run across in their "Mid-Range"
approach several years ago. Part of it consisted of having a
two-person team do the entire acquisition - a technical person and
the CO. Both would evaluate all aspects of industry's response
and come to consensus. Of course, the CO likely would defer to
the technical person's expertise on detailed issues. Likewise
the CO's business judgment likely would prevail on other
portions. It seemed so efficient and swift, but never caught on.
By
anon455 on Wednesday, March 13, 2002 - 09:53 am:
Note I said "minimum effective"
evaluators and not "one" evaluator. When an evaluation team is
increased beyond a minimum effective number for reasonable
objectives (technical coverage or work division as examples) the
excess is generally, if not almost always, driven by
politics.
A small overpopulation may be nothing more than miscalculation.
But ten people assigned to a job that five could do effectively
can usually be traced to political drivers. It does happen, and it
usually mucks up the process.
By
joel hoffman on Wednesday, March 13, 2002 - 01:07 pm:
Sorry you've experienced such
overkill. We have been successful at keeping the numbers down on
our teams. We do so many that people get burned out, so the
minimum numbers needed to do the job are what we end up with.
We've been fortunate enough that the client agrees to one voting
member to keep the numbers down - they can have additional
advisors, if necessary. We put two or three in-house folks (PM,
Engineering and Construction, or just Engr. and Const.) on most
construction or design-build trade-off method selection boards.
For service contracts, usually 2-3, plus the installation
client.
I've done the evaluation alone, for low price, tech acceptable
method. happy sails! joel
By
joel hoffman on Thursday, March 14, 2002 - 01:47 pm:
THE PROBLEM WITH EQUATING PRICE
WITH POINTS
THE HYPOTHETICAL "PRICE/QUALITY POINTS" EXAMPLE
Given: Maximum Available Points = 1200
Government Estimate: $10 Million
Funds Available: $10.75 Million
Offeror   Price (Rank)    Quality Points (Rank)   $ per Point
1         $10.00M  (1)    600  (6)                $16,667
2         $10.25M  (2)    615  (5)                $16,667
3         $10.50M  (3)    630  (4)                $16,667
4         $10.75M  (4)    645  (3)                $16,667
5         $11.00M  (5)    670  (2)                $16,418
6         $12.00M  (6)    800  (1)                $15,000
Question:
Which Proposal Offers the "Best Value" and Wins the Contract, If
the Price/Quality Point Ratio Is the Basis of Award?
This example illustrates the fallacy of trying to oversimplify
the cost/technical trade-off analysis. Note that price and
quality must be equal in weight to use this type of analysis. It
is, at best, just one indicator of value. Points are not precise
measures of value. The abuse of point scoring systems through
overreliance on the "number of points," rather than development
of strong narratives describing advantages and disadvantages, was
a major contributor to the Army's decision to ban the use of
point scoring systems in 2001.
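For anyone who wants to check the arithmetic, the dollars-per-point figures in the table can be reproduced with a few lines of code (the offeror data are taken from the example above):

```python
# Offeror, price, and quality points from the example above.
offers = [
    ("Offeror 1", 10_000_000, 600),
    ("Offeror 2", 10_250_000, 615),
    ("Offeror 3", 10_500_000, 630),
    ("Offeror 4", 10_750_000, 645),
    ("Offeror 5", 11_000_000, 670),
    ("Offeror 6", 12_000_000, 800),
]

for name, price, quality_points in offers:
    print(f"{name}: ${price / quality_points:,.0f} per quality point")

# Offerors 1 through 4 all land at roughly $16,667 per point despite different
# prices and quality scores, and Offeror 6 "wins" at $15,000 per point even
# though its price exceeds the available funds -- the ratio hides exactly the
# differences a trade-off analysis is supposed to examine.
```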
By
joel hoffman on Thursday, March 14, 2002 - 01:53 pm:
THE PROBLEM WITH SCORING PRICE
Given: Maximum Points Available for Price or for Quality = 1,200
Government Estimate: $10 Million
Funds Available: $10.75 Million
Price Score: Lowest Price = Full Points.
Points for Higher Prices Are Pro-rated by the Ratio of the Lowest
Price to the Offeror's Price.
Offeror   Price (Rank)    Price Points   Quality Points (Rank)   Total Points (Rank)
1         $10.00M  (1)    1,200          600  (6)                1,800  (1)
2         $10.25M  (2)    1,171          615  (5)                1,786  (3)
3         $10.50M  (3)    1,143          630  (4)                1,773  (4)
4         $10.75M  (4)    1,116          645  (3)                1,761  (6)
5         $11.00M  (5)    1,091          670  (2)                1,761  (5)
6         $12.00M  (6)    1,000          800  (1)                1,800  (1)
Question:
Which Proposal Offers the "Best Value" and Wins the Contract,
using using this scoring system as the basis of award?
· This illustrates the fallacy of trying to score price. Who can
say what a "price point" is worth?
· A cost/technical trade-off analysis is still absolutely necessary.
· Points are not precise measures of value.
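Again, the arithmetic behind the table can be reproduced directly (the data are from the example above; points are rounded for display):

```python
MAX_POINTS = 1200  # maximum points available for price (and for quality)

# Offeror, price, and quality points from the example above.
offers = [
    ("Offeror 1", 10_000_000, 600),
    ("Offeror 2", 10_250_000, 615),
    ("Offeror 3", 10_500_000, 630),
    ("Offeror 4", 10_750_000, 645),
    ("Offeror 5", 11_000_000, 670),
    ("Offeror 6", 12_000_000, 800),
]

lowest_price = min(price for _, price, _ in offers)

for name, price, quality_points in offers:
    price_points = MAX_POINTS * lowest_price / price  # lowest price gets full points
    total = price_points + quality_points
    print(f"{name}: price points = {price_points:,.0f}, total = {total:,.0f}")

# Offeror 1 (lowest price, lowest quality) and Offeror 6 (highest price, highest
# quality) tie at 1,800 total points, so the scoring scheme alone cannot choose
# between them -- the trade-off judgment still has to be made.
```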
By
Vern Edwards on Thursday, March 14, 2002 - 03:12 pm:
Joel:
I love ya, Buddy, but your example, aside from the fact that
it's hard to figure out in the format in which you've set it up
(next time, use Bob's instructions to make a table), is
ridiculous. You've set up a ridiculous example and then tried to
use it to establish a general principle--that the assignment of
numerical scores to price is a bad idea. All you've done is
shown us a badly-designed scheme.
If an idiot had done this I wouldn't bother to say much, but
you're not an idiot. What's more, you're an engineer, so you are
mathematically competent. You know about functions, scales, and
scale conversions.
If you don't like the idea of assigning numerical scores to
price, that's okay with me. Lots of people don't. But if you
want to persuade others that it's logically unsound and that it
cannot be done effectively by anyone other than idiots, then
make a better argument. Much better. Only the innumerate would
assent to your generalization based on your example. I showed
your example to my high school senior and he cracked up. His
logic is good enough for him to understand that you used a dead
horse to prove that an elephant doesn't have a trunk.
There are a large number of good college textbooks and other
books that describe schemes known by names such as "simple
additive weighting (SAW)" or "simple multiattribute rating
technique (SMART)" that make an ass out of your argument. Have
you read even one of the books that I've recommended at least
ten times over the past few years here at Wifcon and at the old
Watercooler? Even one, Joel?
Shame on you for this.
Your friend (really),
Vern
P.S. I wouldn't be so upset about this if I didn't think so
highly of you.
By
joel hoffman on Thursday, March 14, 2002 - 04:37 pm:
Vern, unfortunately, many people
who have been scoring price don't read those books, hence the
problem. Oversimplification of the process is exactly the
problem. happy sails! joel
By
joel hoffman on Thursday, March 14, 2002 - 04:39 pm:
(Vern, I read the one you
recommended which was co-written by an Edwards, last year. Is he
your brother? happy sails!)
By
Vern Edwards on Thursday, March 14, 2002 - 07:07 pm:
Joel:
If you read Edwards and Von Winterfeldt, then you should know
better. The problem with systems in which numerical scores are
assigned to both nonprice factors and to price is that
government teams aren't willing to do the hard thinking required
to develop a sound decision analysis system.
You asked, "Who can say what a 'price point' is worth?" Well,
with respect to your example, nobody, because you didn't tell us
(a) what the technical factors are, (b) what scales were used to
measure them, (c) whether the scales used are interval or ratio,
(d) what functions were used to convert the factor measurements
to scores, and (e) the factor importance weights. You rigged it
so that nobody could answer your question, then you argued that
you had proved something to us.
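For readers who have not seen items (a) through (e) spelled out, here is a minimal sketch in the spirit of SAW/SMART; the factors, scales, value functions, and weights are hypothetical and are not offered as a recommended system:

```python
# Hypothetical SAW/SMART-style setup. Each factor has (1) a measurement scale,
# (2) a value function mapping raw measurements onto a common 0-1 scale, and
# (3) an importance weight. Only once these are defined does a "point" of price
# have a stated worth relative to a point of quality.

def price_value(price, best=10_000_000, worst=12_000_000):
    """Linear value function: the best expected price scores 1.0, the worst 0.0."""
    return max(0.0, min(1.0, (worst - price) / (worst - best)))

def quality_value(points, max_points=800):
    """Value function for the raw quality score (treated as a ratio scale)."""
    return points / max_points

weights = {"price": 0.40, "quality": 0.60}  # importance weights, summing to 1.0

offers = [("Offeror 1", 10_000_000, 600), ("Offeror 6", 12_000_000, 800)]

for name, price, quality_points in offers:
    overall = (weights["price"] * price_value(price)
               + weights["quality"] * quality_value(quality_points))
    print(f"{name}: overall value = {overall:.3f}")

# With these (hypothetical) weights and value functions, Offeror 1 scores 0.850
# and Offeror 6 scores 0.600. Change the weights or the value functions and the
# answer can change -- which is why the design work has to be done explicitly.
```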
People respect you, Joel, which is why I think you have a
special responsibility in these things and why I got so mad
about your example. I agree with you that most government teams
should not try to use total numerical scoring systems, but I
agree because I think, as you do, that most of them are not
competent to do so, not because such systems are inherently
unsound.
No, Professor Ward Edwards and I are not related and we have not
met.
Vern |