
Use of Number Scores in DoD Source Selection
By K.C. on Wednesday, September 04, 2002 - 06:56 pm:

I'm currently supporting a program office that is considering using numerical scores in a source selection. The contracting officer states that 1) "Numbers always lead to a protest" and 2) "GAO doesn't like numbers used." I understand that in general, the sole use of numbers without justification is not considered appropriate ...

Question A) Does anyone have any empirical data regarding Numerical Scores vs. Adjectival Ratings in protest situations?

Question B) Does anyone know of a GAO (or other Court decision) that provides that one shall not ever use numerical scores -- or the opposite, where GAO found use of numerical scores adequate?


By joel hoffman on Wednesday, September 04, 2002 - 07:53 pm:

I don't think you will find a court or GAO decision that says not to use points. The problems have been in the application and execution of the rating system. Scores should be assigned based on the narrative evaluation, not the other way around. And scores aren't absolute discriminators, only indicators of quality. Too often, scores have been taken literally, as the discriminating factor in the selection process, when the narrative strengths, weaknesses, and deficiencies are the real discriminators.

If this is an Army-conducted acquisition, AFARS 5115.304 (Evaluation Factors and Significant Subfactors) prohibits the use of numerical scoring and of numerical weighting of the relative importance of factors and subfactors. If this is an Air Force acquisition, their supplement likewise prescribes a qualitative rating system, not a numerical one. Does this help? happy sails! joel hoffman


By Vern Edwards on Wednesday, September 04, 2002 - 07:59 pm:

K.C.:

I hope that you realize that you have probably started a big fight among us.

Okay, first, your CO was wrong when he/she said that numbers always lead to a protest. That's nuts.

Second, it's not true that the GAO does not like numerical scoring. What the GAO doesn't like is any scoring system that is not backed up by documentation of evaluation findings concerning each offeror's performance on the evaluation factors for award. Such documentation is sometimes referred to as "narrative" descriptions of "strengths" and "weaknesses." Here is the closest thing you will get to a policy statement by the GAO:

"Our Office does not independently evaluate proposals, but instead reviews the agency's evaluation to ensure that it is reasonable and consistent with the solicitation's stated evaluation scheme. Westinghouse Gov't and Envtl. Servs. Co.,Inc., B-280928 et al., Dec. 4, 1998, 99-1 CPD ¶ 3 at 5. When considering the ratings assigned by an agency to an offeror's proposal we have consistently taken the position that evaluation ratings, be they adjectival, numerical or color, are merely guides for intelligent decision-making in the procurement process. KBM Group, Inc., B-281919, B-281919.2, May 3, 1999, 99-1 CPD ¶ 118 at 11. The relevant consideration in our review of an agency's evaluation is whether the record demonstrates that the agency reasonably assessed the relative merits of the proposals in accordance with the stated criteria. Id."

In response to your questions:

A) I doubt that anyone has compiled statistics pertaining to the number of protests sustained because of errors in numerical scoring vs. the number sustained because of errors in adjectival rating. However, you could compile such statistics if you were willing to put in many, many hours of research in a law library or at a computer.

B) I am not aware of any protest decision in which a court or the GAO said that an agency may not ever use numerical scoring. (Did I understand your question rightly?) However, there have been many decisions in which a court or the GAO has found that numerical scoring was adequate.

If you understand the reasons for using "scores" or "ratings" and the underlying principles of scoring, then you probably understand why a well-constructed numerical scoring system is superior to adjectival and color rating systems. But having said that much, I have come to accept after twenty years of teaching source selection that most government contracting personnel do not understand the underlying principles of scoring, are not competent in the proper design and use of numerical scoring systems, and therefore should not use them. Not merely because they would lose a protest, which is actually pretty hard to do, but because they may make a bad decision.

Finally, if you keep your evaluation plan simple enough, you shouldn't need any type of scoring system, numerical, adjectival, or color.


By joel hoffman on Wednesday, September 04, 2002 - 08:10 pm:

P.S., I know the leader of the team at Army Acquisition that wrote the ~April 2002 change to the AFARS prohibiting numerical scoring. She explained that it was an internal decision, not a response to any one GAO decision. It was based on the overall problem I described above.

Many of my Corps of Engineers associates had big heartburn over the decision and many were misusing the scoring method, as described in my last post.

Once they realized that the narrative evaluation for a factor drives the rating (rather than assigning a score first and then developing a rationale to support it), they quickly adapted to the adjectival rating systems. The most difficult aspect for most of us has been not being able to assign a "weight" to the relative importance of the various factors and subfactors. But we've adapted! happy sails! joel


By joel hoffman on Wednesday, September 04, 2002 - 09:01 pm:

Vern, I agree with your preference for a well designed scoring system as well as your assessment of most people's comprehension and application of scoring systems in actual practice. Too many think 1) a total score is a precise distinguisher, 2) the relative weights between factors or subfactors are precise, 3) that you assign a score, then develop the narrative to justify the score, 4) that the difference in scores is enough to justify the selection, without a trade-off, etc., etc. happy sails! joel


By Vern Edwards on Wednesday, September 04, 2002 - 09:20 pm:

Joel:

AFARS 5115.304(b)(iv) is a perfect example of the ignorance and incompetence problem:

"Must be qualitative. Numerical weighting (i.e., assigning points or percentages to evaluation factors and subfactors) is not an authorized method of expressing the relative importance of these factors and subfactors. Evaluation factors and subfactors must be definable in readily understood qualitative terms (i.e., adjectival, colors, or other indicators, but not numbers) and represent the key areas of importance to be considered in the source selection process. The direction of this subparagraph is not waivable, either on an individual or class basis, as an AFARS deviation."

Note the phrase: "Numerical weighting (i.e., assigning points or percentages to evaluation factors and subfactors)... ." When scoring, you don't assign points to evaluation factors, you assign them to proposals. Moreover, scoring and weighting are different things entirely, yet the writer appears to be confused about this ("i.e., assigning points or percentages").
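
To put the distinction in concrete terms, here is a minimal sketch in Python (the factor names, weights, and numbers are all hypothetical, chosen only to illustrate the point): weights are fixed properties of the evaluation factors, set in the evaluation plan, while scores are assigned to each proposal.

    # A minimal sketch of the scoring-vs-weighting distinction.
    # All factor names, weights, and scores are hypothetical.

    # Weights express the relative importance of the FACTORS. They are set
    # in the evaluation plan and are never assigned to proposals.
    weights = {"technical": 0.5, "past_performance": 0.3, "price": 0.2}

    # Scores express the evaluated merit of each PROPOSAL under each factor,
    # derived from the narrative evaluation findings.
    scores = {
        "Offeror A": {"technical": 85, "past_performance": 90, "price": 70},
        "Offeror B": {"technical": 92, "past_performance": 75, "price": 80},
    }

    for offeror, factor_scores in scores.items():
        total = sum(weights[f] * s for f, s in factor_scores.items())
        print(f"{offeror}: weighted total = {total:.1f}")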

Also, it's funny that the evaluation factors used to select service contractors must be "qualitative," while FAR requires agencies to develop "measurable" performance standards for services.


By K.C. on Wednesday, September 04, 2002 - 09:40 pm:

Thanks for the feedback. I support the Navy, which has, as you likely know, mostly moved to adjectival ratings because people used scores to justify the decision instead of using analysis and documentation to support the scores ... The program office is in a frustrating environment, where the PCO and lawyers respond with urban myths ("numbers = protest") rather than research.

Vern, from what decision did you cut and paste the GAO "policy" statement? Also, I'm intrigued -- what is a "simple" evaluation plan with no rating scheme?

Again, REALLY appreciate the thoughtful, researched feedback.


By joel hoffman on Wednesday, September 04, 2002 - 09:47 pm:

Vern, agreed.

The language in the AFARS was a last-minute revision to the AFARS rewrite, made after the draft document was sent out for review in the spring of 2001. When the new Army acquisition executive came on board, he decreed that numerical weights and rating systems would no longer be used, and that he didn't intend to debate anyone about it or listen to any other viewpoint. It was simply an order. happy sails! joel


By Vern Edwards on Wednesday, September 04, 2002 - 10:23 pm:

K.C.:

Tennier Industries, Inc., B-286706.2, Mar. 14, 2001, 2002 CPD ¶ 75.


By Vern Edwards on Wednesday, September 04, 2002 - 10:46 pm:

K.C.:

A simple plan would include the following factors:

1. offer acceptability (the offeror's assent to the terms of the RFP, evaluated on a pass/fail basis; this factor includes the socio-economic program evaluation factors);

2. risk (based on offeror experience and past performance); and,

3. price or cost to the government.

You make offer acceptability and risk equal to each other and each significantly more important than price or cost.

Offerors who submit unacceptable offers are eliminated without any evaluation of their experience, past performance, or price.

You then determine and summarize the experience and past performance of each offeror who submitted an acceptable offer. You combine your findings under the rubric of "risk". (The better the experience and past performance, the lower the risk.)

You then perform a tradeoff analysis between risk and price or cost in order to select a winner.

In order to perform a tradeoff analysis you compare the offerors to each other in a series of pairs. If, in any pairing, one offeror has better experience and past performance and a lower price, then it is the better value. If one offeror has the better experience and past performance and the higher price, then you subjectively decide whether the lower risk is worth the higher price. Making this decision entails deciding whether you want to pay a higher price to avoid the risk, or take the risk in the hope of paying a lower price. The decision will withstand any protest as long as it is within the realm of reasonableness.

You continue in this way until you have compared all the offerors who submitted acceptable offers and determined which is the best value.
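
As a rough illustration of the procedure just described, here is a minimal sketch in Python (the offerors, risk ratings, and prices are hypothetical, and the subjective tradeoff step is deliberately left as a human judgment, which no code can make):

    # Sketch of the acceptability screen and pairwise tradeoff described above.
    # All offeror data is hypothetical.
    from itertools import combinations

    RISK_ORDER = {"low": 0, "moderate": 1, "high": 2}  # lower risk is better

    offerors = [
        {"name": "A", "acceptable": True,  "risk": "low",      "price": 10.5},
        {"name": "B", "acceptable": True,  "risk": "moderate", "price": 9.8},
        {"name": "C", "acceptable": False, "risk": "low",      "price": 9.0},
    ]

    # Unacceptable offers are eliminated without any further evaluation.
    candidates = [o for o in offerors if o["acceptable"]]

    # Compare the remaining offerors in pairs.
    for a, b in combinations(candidates, 2):
        a_risk, b_risk = RISK_ORDER[a["risk"]], RISK_ORDER[b["risk"]]
        if a_risk <= b_risk and a["price"] <= b["price"]:
            print(f"{a['name']} is the better value than {b['name']}")
        elif b_risk <= a_risk and b["price"] <= a["price"]:
            print(f"{b['name']} is the better value than {a['name']}")
        else:
            # One offeror is lower risk, the other lower priced: the source
            # selection authority makes a documented, subjective tradeoff.
            print(f"{a['name']} vs. {b['name']}: subjective tradeoff required")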


By bob antonio on Thursday, September 05, 2002 - 05:26 am:

K. C.

You can find Tennier here.

http://www.wifcon.com/pd15_305factors.htm


By Linda Koone on Thursday, September 05, 2002 - 08:07 am:

K.C.

Also being with the Navy, I thought I'd point out that the Navy supplement (NAPS) to the FAR/DFARS does not specifically prohibit assigning point scores to cost or price, but provides guidance that generally discourages it at 5215.305(a)(1), which states, in part:

"(1) Cost or price evaluation. Methods of evaluation which assign a point score to cost or price and combine it with point scores for other evaluation factors generally should not be used. Point scores can be helpful in summarizing subjective evaluation of technical and other factors, but are not needed in evaluating cost or price and tend to obscure the tradeoff between cost/price and other factors, rather than clarifying it. If point scoring of cost/price is utilized, it should be demonstrated that the value of a cost/price point is comparable, in value to the Government, to the value of a non-cost/price point."


By formerfed on Thursday, September 05, 2002 - 08:45 am:

K.C.,

One further GAO quote that's relevant - "Numerical point scores, while useful as guides to decision-making, do not of themselves supply the basis and reason for the award decision."

My old agency had much success on some procurements using an approach we termed "succinct narrative." It was patterned after NASA's mid-range streamlined approach, which I believe is in their regulations.

It requires the development of discriminators unique to each individual procurement. You need to forget about all the routine factors commonly used for most procurements and get at the question: "What are the distinguishing attributes among offerors' products or services that affect how successfully the product or service is deployed?" Then proposal evaluation consists of writing concise wording that describes how each offeror meets the factors. There are no point scores, adjectival ratings, or summary scores. It's just a summary wording of the benefits, weaknesses, and deficiencies in very brief terms.
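
As one way to picture the record this produces, here is a minimal sketch in Python (the discriminators and findings are invented for illustration; any real set would be developed for the specific procurement):

    # Sketch of a "succinct narrative" evaluation record: procurement-specific
    # discriminators with brief findings per offeror, and no scores or ratings.
    # All discriminators and findings here are hypothetical.
    evaluation = {
        "Offeror A": {
            "legacy data conversion": "Benefit: proven toolset; migrated a comparable data volume on two prior efforts.",
            "help desk surge capacity": "Weakness: staffing plan covers steady state only; no surge reserve identified.",
        },
        "Offeror B": {
            "legacy data conversion": "Deficiency: approach requires manual re-keying and cannot meet the cutover window.",
            "help desk surge capacity": "Benefit: standing 24-hour surge team shared across the region.",
        },
    }

    for offeror, findings in evaluation.items():
        print(offeror)
        for discriminator, finding in findings.items():
            print(f"  {discriminator}: {finding}")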

The good thing about this approach is that the award selection decision is often more readily apparent. It gets the evaluators to focus on what the procurement is all about and the purpose the product or service is used for. Finally, it really goes to the core of what separates one offer from another.


By Anonymous on Thursday, September 05, 2002 - 09:36 am:

Vern,

I like your 3 step/factor evaluation scheme. Coincidentally, it is similar to the one I used when I proposed to my wife, except in item 2 where risk was directly proportional to the amount of prior experience.

Anonymous (for obvious reasons)


By K.C. on Thursday, September 05, 2002 - 10:05 am:

Thanks again, everyone, for the great advice, links, and citations.

Vern -- Interesting approach for evaluation (acceptability, risk, and cost). I'd love to give it a whirl. I am of the general opinion that a winning proposal has nothing to do with successful contract execution (with the possible exception of incumbent sources). At the end of the day, if the Government incorporates the technical approach, etc., it undoubtedly locks in a solution that was developed by industry without enough information ... and if it doesn't incorporate the technical approach into the contract, why bother listening to it? The other problem is that in many cases the Government does not maintain the technical expertise to develop a solid requirements package ... and therefore how will evaluators know when a proposal is sound? Rhetorical questions ...

Have you used this simplified approach, or seen it demonstrated with success? On a major system?


By Vern Edwards on Friday, September 06, 2002 - 09:03 am:

K.C.:

The approach, with variations, has been used in a number of source selections. It was used last year by the U.S. Customs Service to award the multi-billion dollar Customs Modernization contract. They added "understanding of the requirement" and "key personnel" as subfactors under risk. They evaluated "understanding" based on oral presentations. I helped them write the source selection plan and the RFP proposal preparation instructions and evaluation factors for award.


By Vern Edwards on Friday, September 06, 2002 - 09:25 am:

K.C.:

The Customs RFP is still on the Web at http://www.customs.ustreas.gov/contract/modern/rfp/rfp.pdf. Read Section M, which you will find at the very end of the RFP.

The evaluation scheme is a little more complicated than the basic scheme that I described above, but it included only four main evaluation factors -- offer acceptability, offeror capability, risk and price -- which is very simple for such a large procurement. It also included the "contingency procedure" that I have advocated in past postings.

The oral presentations were longer than I would have liked -- 4.5 to 5 hours -- but shorter than the agency had originally wanted, which was understandable in light of what was at stake for them. They included a one-hour formal presentation and 3 to 4 hours for a Q&A session. It's a huge task order contract to develop an enterprise-wide IT system for the agency. The contract performance period, with options, continues through the year 2011.


By K.C. on Friday, September 06, 2002 - 11:42 am:

Vern,

Thanks for the link. Would be interesting to compare this source selection experience with other enterprise IT system source selections (Navy Marine Corps Intranet, SOCOM's Enterprise Information Technology Contract, Army Infostructure, etc etc).

I like the contingency procedure -- it puts some teeth into the statement that's always there about "reserving the right" to limit the competitive range for efficiency.

Regarding terms and conditions ... Wonder how the "acceptability" part would play out in a FAR 12 acquisition -- where negotiation of terms and conditions is generally acceptable (and expected)?

Also -- what was the point of the professional employee compensation plan? To see if Customs could rely on key personnel availability throughout the contract?

Thanks again for the link.


By Vern Edwards on Friday, September 06, 2002 - 12:05 pm:

K.C.:

Yes, the acceptability criterion could work in a commercial items procurement, although you might not want to use it if you're buying supplies.

Professional employee compensation is a mandatory evaluation factor in certain services acquisitions. See FAR § 22.1103. Customs was very concerned that the contractor maintain a high quality IT staff.
