Higher Learning Commission


Policy Title: Assignment of Credits, Program Length and Tuition

Number: FDCR.A.10.020

An institution shall be able to equate its learning experiences with semester or quarter credit hours using practices common to institutions of higher education, to justify the lengths of its programs in comparison to similar programs found in accredited institutions of higher education, and to justify any program-specific tuition in terms of program costs, program length, and program objectives. Institutions shall notify HLC of any significant changes in the relationships among credits, program length, and tuition.

Assignment of Credit Hours. The institution’s assignment and award of credit hours shall conform to commonly accepted practices in higher education. Institutions seeking, or participating in, Title IV federal financial aid shall demonstrate that they have policies determining the credit hours awarded to courses and programs in keeping with commonly accepted practices in higher education and with any federal definition of the credit hour that may appear in federal regulations, and that they also have procedures that result in an appropriate awarding of institutional credit in conformity with the policies established by the institution.

HLC Review. HLC shall review an institution’s compliance with this policy in conjunction with a comprehensive evaluation for Candidacy, Initial Accreditation or Reaffirmation of Accreditation during HLC’s assurance process. Institutions shall also produce evidence of compliance with this policy upon demand in accordance with HLC policy. HLC may sample or use other techniques to review selected institutional programs to ensure that it has reviewed the reliability and accuracy of the institution’s assignment of credit. HLC shall monitor, through its established monitoring processes, the resolution of any concerns related to an institution’s compliance with this policy as identified during that evaluation and shall require that an institution remedy any deficiency in this regard by a date certain but not to exceed two years from the date of the action identifying the deficiency.

HLC Action for Systemic Noncompliance. In addition to taking appropriate action related to the institution’s compliance with the Federal Compliance Requirements, HLC shall notify the Secretary of Education if, following any review process identified above or through any other mechanism, HLC finds systemic noncompliance with HLC’s policies in this section regarding the awarding of academic credit.

HLC shall understand systemic noncompliance to mean that an institution lacks policies to determine the appropriate awarding of academic credit or that there is an awarding by an institution of institutional credit across multiple programs or divisions or affecting significant numbers of students not in conformity with the policies established by the institution or with commonly accepted practices in higher education.

Policy History

Last Revised: November 2020

First Adopted: February 1996

Revision History: Adopted February 1996, effective September 1996; revised November 2011; revised and combined with policies 3.10, 3.10(a), 3.10(b), and 3.10(c) June 2012; revised June 2019, effective September 1, 2019; revised November 2020

Notes: Former policy number 4.0(a). In February 2021, references to the Higher Learning Commission as “the Commission” were replaced with the term “HLC.”



Credit: What It Is and How It Works


The word "credit" has many meanings in the financial world, but it most commonly refers to a contractual agreement in which a borrower receives a sum of money or something else of value and commits to repaying the lender at a later date, typically with interest.

Credit can also refer to the creditworthiness or credit history of an individual or a company—as in "she has good credit." In the world of accounting, it refers to a specific type of bookkeeping entry.

Key Takeaways

  • Credit is typically defined as an agreement between a lender and a borrower.
  • Credit can also refer to an individual's or a business's creditworthiness.
  • In accounting, a credit is a type of bookkeeping entry, the opposite of which is a debit.


Credit represents an agreement between a creditor (lender) and a borrower (debtor). The debtor promises to repay the lender, often with interest, or risk financial or legal penalties. Extending credit is a practice that goes back thousands of years, to the dawn of human civilization, according to the anthropologist David Graeber in his book Debt: The First 5000 Years.

There are many different forms of credit. Common examples include car loans, mortgages, personal loans, and lines of credit. Essentially, when the bank or other financial institution makes a loan, it "credits" money to the borrower, who must pay it back at a future date.

Credit cards may be the most ubiquitous example of credit today, allowing consumers to purchase just about anything on credit. The card-issuing bank serves as an intermediary between buyer and seller, paying the seller in full while extending credit to the buyer, who may repay the debt over time while incurring interest charges until it is fully paid off.

Similarly, if buyers receive products or services from a seller who doesn't require payment until later, that is a form of credit. For example, when a restaurant receives a truckload of produce from a wholesaler who will bill the restaurant for it a month later, the wholesaler is providing the restaurant owner with a form of credit.

"Credit" is also used as shorthand to describe the financial soundness of businesses or individuals. Someone who has good or excellent credit is considered less of a risk to lenders than someone with bad or poor credit.

Credit scores are one way that individuals are classified in terms of risk, not only by prospective lenders but also by insurance companies and, in some cases, landlords and employers. For example, the commonly used FICO score ranges from 300 to 850. Anyone with a score of 800 or higher is considered to have exceptional credit, 740 to 799 represents very good credit, 670 to 739 is good credit, 580 to 669 is fair, and a score of 579 or less is poor.
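The FICO bands listed above amount to a simple threshold lookup. As a minimal sketch (the function name and error handling are ours, not part of any credit-scoring library):

```python
def fico_band(score: int) -> str:
    """Map a FICO score (300-850) to the band names used above."""
    if not 300 <= score <= 850:
        raise ValueError("FICO scores range from 300 to 850")
    if score >= 800:
        return "exceptional"
    if score >= 740:
        return "very good"
    if score >= 670:
        return "good"
    if score >= 580:
        return "fair"
    return "poor"

print(fico_band(815))  # exceptional
print(fico_band(600))  # fair
```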

Companies are also judged by credit rating agencies, such as Moody's and Standard & Poor's, and given letter-grade scores representing each agency's assessment of their financial strength. Those scores are closely watched by bond investors and can affect how much interest companies will have to offer in order to borrow money. Similarly, government securities are graded based on whether the issuing government or government agency is considered to have solid credit. U.S. Treasuries, for example, are backed by the "full faith and credit" of the United States.

In the world of accounting, "credit" has a more specialized meaning. It refers to a bookkeeping entry that records a decrease in assets or an increase in liabilities (as opposed to a debit, which does the opposite). For example, suppose that a retailer buys merchandise on credit. After the purchase, the company's inventory account increases by the amount of the purchase (via a debit), adding an asset to the company's balance sheet. However, its accounts payable account also increases by the amount of the purchase (via a credit), adding a liability.
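That inventory/accounts-payable entry can be sketched as a toy double-entry ledger. The class, the account names, and the $5,000 amount are all illustrative, not from any accounting package:

```python
from collections import defaultdict

class Ledger:
    """Toy double-entry ledger: every transaction debits one account and
    credits another for the same amount, so the books stay balanced."""

    def __init__(self):
        self.debits = defaultdict(float)
        self.credits = defaultdict(float)

    def record(self, debit_account: str, credit_account: str, amount: float):
        self.debits[debit_account] += amount
        self.credits[credit_account] += amount

    def balanced(self) -> bool:
        return sum(self.debits.values()) == sum(self.credits.values())

# The retailer's purchase on credit from the example above:
books = Ledger()
books.record("inventory", "accounts payable", 5000.0)  # asset up, liability up
print(books.balanced())  # True
```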

Often used in international trade, a letter of credit is a letter from a bank guaranteeing that a seller will receive the full amount that it is due from a buyer by a certain agreed-upon date. If the buyer fails to pay by then, the bank is on the hook for the money.

A credit limit represents the maximum amount of credit that a lender, such as a credit card company, will extend to a borrower, such as a credit card holder. Once borrowers reach the limit, they are unable to make further purchases until they repay some portion of their balance. The term is also used in connection with lines of credit and buy now, pay later loans.

A line of credit refers to a loan from a bank or other financial institution that makes a certain amount of credit available to the borrower to draw on as needed, rather than taking it all at once. One type is the home equity line of credit (HELOC), which allows homeowners to borrow against the value of their home for renovations or other purposes.

Revolving credit involves a loan with no fixed end date; a credit card account is a good example. As long as the account is in good standing, the borrower can continue to borrow against it, up to whatever credit limit has been established. As the borrower makes payments toward the balance, the account is replenished. These kinds of loans are often referred to as open-end credit. Mortgages and car loans, by contrast, are considered closed-end credit because they come to an end on a certain date.

The word "credit" has multiple meanings in personal and business finance. Most often it refers to the ability to buy a good or service and pay for it at some future point. Credit may be arranged directly between a buyer and seller or with the assistance of an intermediary, such as a bank or other financial institution. Credit serves a vital purpose in making the world of commerce run smoothly.

Experian. "What Is a Good Credit Score?"



Encyclopedia of Machine Learning and Data Mining, pp 294–298

Credit Assignment

  • Claude Sammut
  • Reference work entry
  • First Online: 01 January 2017

Synonyms: Structural credit assignment; Temporal credit assignment

When a learning system employs a complex decision process, it must assign credit or blame for the outcomes to each of its decisions. Where it is not possible to directly attribute an individual outcome to each decision, it is necessary to apportion credit and blame between each of the combinations of decisions that contributed to the outcome. We distinguish two cases in the credit assignment problem. Temporal credit assignment refers to the assignment of credit for outcomes to actions. Structural credit assignment refers to the assignment of credit for actions to internal decisions. The first subproblem involves determining when the actions that deserve credit were taken and the second involves assigning credit to the internal structure of actions (Sutton  1984 ).

Consider the problem of learning to balance a pole that is hinged on a cart (Michie and Chambers  1968 ; Anderson and Miller  1991 ). The cart...


Recommended Reading

Albus JS (1975) A new approach to manipulator control: the cerebellar model articulation controller (CMAC). J Dyn Syst Measur Control Trans ASME 97(3):220–227


Anderson CW, Miller WT (1991) A set of challenging control problems. In: Miller W, Sutton RS, Werbos PJ (eds) Neural networks for control. MIT Press, Cambridge


Atkeson C, Schaal S, Moore A (1997) Locally weighted learning. AI Rev 11:11–73

Banerjee B, Liu Y, Youngblood GM (eds) (2006) Proceedings of the ICML workshop on "Structural knowledge transfer for machine learning", Pittsburgh

Barto A, Sutton R, Anderson C (1983) Neuron-like adaptive elements that can solve difficult learning control problems. IEEE Trans Syst Man Cybern SMC-13:834–846


Benson S, Nilsson NJ (1995) Reacting, planning and learning in an autonomous agent. In: Furukawa K, Michie D, Muggleton S (eds) Machine intelligence, vol 14. Oxford University Press, Oxford

Bertsekas DP, Tsitsiklis J (1996) Neuro-dynamic programming. Athena Scientific, Nashua


Caruana R (1997) Multitask learning. Mach Learn 28:41–75

Dejong G, Mooney R (1986) Explanation-based learning: an alternative view. Mach Learn 1:145–176

Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley Longman Publishing, Boston

Grefenstette JJ (1988) Credit assignment in rule discovery systems based on genetic algorithms. Mach Learn 3(2–3):225–245

Hinton G, Rumelhart D, Williams R (1985) Learning internal representation by back-propagating errors. In: Rumelhart D, McClelland J, Group TPR (eds) Parallel distributed processing: explorations in the microstructure of cognition, vol 1. MIT Press, Cambridge, pp 31–362

Holland J (1986) Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Michalski RS, Carbonell JG, Mitchell TM (eds) Machine learning: an artificial intelligence approach, vol 2. Morgan Kaufmann, Los Altos

Laird JE, Newell A, Rosenbloom PS (1987) SOAR: an architecture for general intelligence. Artif Intell 33(1):1–64

Mahadevan S (2009) Learning representation and control in Markov decision processes: new frontiers. Found Trends Mach Learn 1(4):403–565

Michie D, Chambers R (1968) Boxes: an experiment in adaptive control. In: Dale E, Michie D (eds) Machine intelligence, vol 2. Oliver and Boyd, Edinburgh

Minsky M (1961) Steps towards artificial intelligence. Proc IRE 49:8–30


Mitchell TM, Keller RM, Kedar-Cabelli ST (1986) Explanation based generalisation: a unifying view. Mach Learn 1:47–80

Mitchell TM, Utgoff PE, Banerji RB (1983) Learning by experimentation: acquiring and refining problem-solving heuristics. In: Michalski R, Carbonell J, Mitchell T (eds) Machine learning: an artificial intelligence approach. Tioga, Palo Alto

Moore AW (1990) Efficient memory-based learning for robot control. Ph.D. thesis, UCAM-CL-TR-209, Computer Laboratory, University of Cambridge, Cambridge

Niculescu-Mizil A, Caruana R (2007) Inductive transfer for Bayesian network structure learning. In: Proceedings of the 11th international conference on AI and statistics (AISTATS 2007), San Juan

Reid MD (2004) Improving rule evaluation using multitask learning. In: Proceedings of the 14th international conference on inductive logic programming, Porto, pp 252–269

Reid MD (2007) DEFT guessing: using inductive transfer to improve rule evaluation from limited data. Ph.D. thesis, School of Computer Science and Engineering, The University of New South Wales, Sydney

Rosenblatt F (1962) Principles of neurodynamics: perceptrons and the theory of brain mechanisms. Spartan Books, Washington, DC

Samuel A (1959) Some studies in machine learning using the game of checkers. IBM J Res Develop 3(3):210–229

Silver D, Bakir G, Bennett K, Caruana R, Pontil M, Russell S et al (2005) NIPS workshop on “inductive transfer: 10 years later”, Whistler

Sutton R (1984) Temporal credit assignment in reinforcement learning. Ph.D. thesis, Department of Computer and Information Science, University of Massachusetts, Amherst

Sutton R, Barto A (1998) Reinforcement learning: an introduction. MIT Press, Cambridge

Taylor ME, Stone P (2009) Transfer learning for reinforcement learning domains: a survey. J Mach Learn Res 10:1633–1685


Wang X, Simon HA, Lehman JF, Fisher DH (1996) Learning planning operators by observation and practice. In: Proceedings of the second international conference on AI planning systems (AIPS-94), Chicago, pp 335–340

Watkins C (1989) Learning with delayed rewards. Ph.D. thesis, Psychology Department, University of Cambridge, Cambridge

Watkins C, Dayan P (1992) Q-learning. Mach Learn 8(3–4):279–292


Author information

Claude Sammut, The University of New South Wales, Sydney, NSW, Australia

Editor information

Geoffrey I. Webb, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia


Copyright information

© 2017 Springer Science+Business Media New York

About this entry

Sammut, C. (2017). Credit Assignment. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_185

Published: 14 April 2017

Print ISBN: 978-1-4899-7685-7; Online ISBN: 978-1-4899-7687-1


AI Alignment Forum

The Credit Assignment Problem

This post is eventually about partial agency. However, it's been a somewhat tricky point for me to convey; I take the long route. Epistemic status: slightly crazy.

I've occasionally said "Everything boils down to credit assignment problems."

What I really mean is that credit assignment pops up in a wide range of scenarios, and improvements to credit assignment algorithms have broad implications. For example:

  • When politics focuses on (re-)electing candidates based on their track records, it's about credit assignment. The practice is sometimes derogatorily called "finger pointing", but the basic computation makes sense: figure out good and bad qualities via previous performance, and vote accordingly.
  • When politics instead focuses on policy, it is still (to a degree) about credit assignment. Was raising the minimum wage responsible for reduced employment? Was it responsible for improved life outcomes? Etc.
  • Money acts as a kind of distributed credit-assignment algorithm, and questions of how to handle money, such as how to compensate employees, often involve credit assignment.
  • In particular, mechanism design (a subfield of economics and game theory) can often be thought of as a credit-assignment problem.
  • Both criminal law and civil law involve concepts of fault and compensation/retribution -- these at least resemble elements of a credit assignment process.
  • The distributed computation which determines social norms involves a heavy element of credit assignment: identifying failure states and success states, determining which actions are responsible for those states and who is responsible, assigning blame and praise.
  • Evolution can be thought of as a (relatively dumb) credit assignment algorithm.
  • Justice, fairness, contractualism, issues in utilitarianism.
  • Bayesian updates are a credit assignment algorithm, intended to make high-quality hypotheses rise to the top.
  • Beyond the basics of Bayesianism, building good theories realistically involves identifying which concepts are responsible for successes and failures. This is credit assignment.

Another big area which I'll claim is "basically credit assignment" is artificial intelligence.

In the 1970s, John Holland kicked off the investigation of learning classifier systems. Holland had recently invented the genetic algorithms paradigm, which applies evolutionary methods to optimization problems. Classifier systems were his attempt to apply this kind of "adaptive" paradigm (as in "complex adaptive systems") to cognition. Classifier systems added an economic metaphor to the evolutionary one; little bits of thought paid each other for services rendered. The hope was that a complex ecology+economy could develop, solving difficult problems.

One of the main design issues for classifier systems is the virtual economy -- that is, the credit assignment algorithm. An early proposal was the bucket-brigade algorithm. Money is given to cognitive procedures which produce good outputs. These procedures pass reward back to the procedures which activated them, who similarly pass reward back in turn. This way, the economy supports chains of useful procedures.

Unfortunately, the bucket-brigade algorithm was vulnerable to parasites. Malign cognitive procedures could gain wealth by activating useful procedures without really contributing anything. This problem proved difficult to solve. Taking the economy analogy seriously, we might want cognitive procedures to decide intelligently who to pay for services. But, these are supposed to be itty bitty fragments of our thought process. Deciding how to pass along credit is a very complex task. Hence the need for a pre-specified solution such as bucket-brigade.
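The basic bucket-brigade dynamic described above can be sketched in a few lines. Everything here is invented for illustration: the rule names, the fixed activation chain, the bid fraction, and the reward schedule. Each rule pays a fraction of its strength (its "bid") to the rule that activated it, and the final rule collects the external reward, so payment gradually seeps back along the chain:

```python
def bucket_brigade(chain, reward, bid_fraction=0.1, steps=200):
    """Toy bucket brigade over a fixed chain of rules.

    Each rule pays a bid (a fraction of its strength) to its predecessor;
    the first rule's bid is simply forfeited to the environment. The last
    rule in the chain collects the external reward each step.
    """
    strength = {rule: 1.0 for rule in chain}
    for _ in range(steps):
        for i, rule in enumerate(chain):
            bid = bid_fraction * strength[rule]
            strength[rule] -= bid
            if i > 0:                      # pay the rule that activated us
                strength[chain[i - 1]] += bid
        strength[chain[-1]] += reward      # environment pays the final rule
    return strength

s = bucket_brigade(["sense", "plan", "act"], reward=1.0)
print(s["sense"] > 1.0)  # True: reward has propagated back along the chain
```

Note the sketch also makes the parasite problem easy to see: any rule that manages to insert itself into the chain collects bids whether or not it contributed anything.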

The difficulty of the credit assignment problem led to a split in the field. Kenneth de Jong and Stephen Smith founded a new approach, "Pittsburgh style" classifier systems. John Holland's original vision became "Michigan style".

Pittsburgh style classifier systems evolve the entire set of rules, rather than trying to assign credit locally. A set of rules will stand or fall together, based on overall performance. This abandoned John Holland's original focus on online learning. Essentially, the Pittsburgh camp went back to plain genetic algorithms, albeit with a special representation.

(I've been having some disagreements with Ofer, in which Ofer suggests that genetic algorithms are relevant to my recent thoughts on partial agency, and I object on the grounds that the phenomena I'm interested in have to do with online learning, rather than offline. In my imagination, arguments between the Michigan and Pittsburgh camps would have similar content. I'd love to be a fly on the wall for those old debates, to see what they were really like.)

You can think of Pittsburgh-vs-Michigan much like raw Bayes updates vs belief propagation in Bayes nets. Raw Bayesian updates operate on whole hypotheses. Belief propagation instead makes a lot of little updates which spread around a network, resulting in computational efficiency at the expense of accuracy. Except Michigan-style systems didn't have the equivalent of belief propagation: bucket-brigade was a very poor approximation.

Ok. That was then, this is now. Everyone uses gradient descent these days. What's the point of bringing up a three-decade-old debate about obsolete paradigms in AI?

Let's get a little more clarity on the problem I'm trying to address.

What Is Credit Assignment?

I've said that classifier systems faced a credit assignment problem. What does that mean, exactly?

The definition I want to use for this essay is:

  • you're engaged in some sort of task;
  • you use some kind of strategy, which can be broken into interacting pieces (such as a set of rules, a set of people, a neural network, etc);
  • you receive some kind of feedback about how well you're doing (such as money, loss-function evaluations, or a reward signal);
  • you want to use that feedback to adjust your strategy.

So, credit assignment is the problem of turning feedback into strategy improvements.

Michigan-style systems tried to do this locally, meaning that individual itty-bitty pieces got positive/negative credit, which influenced their ability to participate, thus adjusting the strategy. Pittsburgh-style systems instead operated globally, forming conclusions about how the overall set of cognitive structures performed. Michigan-style systems are like organizations trying to optimize performance by promoting people who do well and giving them bonuses, firing the incompetent, etc. Pittsburgh-style systems are more like consumers selecting between whole corporations to give business to, so that ineffective corporations go out of business.

(Note that this is not the typical meaning of global-vs-local search that you'll find in an AI textbook.)
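The local/global contrast can be sketched on a toy optimization problem. Everything here (the target vector, the loss, the population size, the learning rate) is invented for illustration: a Pittsburgh-style learner compares whole candidate strategies and keeps the best, while a Michigan-style learner assigns credit to each component separately, here via the loss gradient:

```python
import random

random.seed(0)
TARGET = [0.2, 0.7, 0.5]                       # unknown "ideal" strategy

def loss(strategy):
    return sum((s - t) ** 2 for s, t in zip(strategy, TARGET))

# Pittsburgh-style / global: evaluate whole strategies, keep the best one.
population = [[random.random() for _ in range(3)] for _ in range(50)]
best_global = min(population, key=loss)

# Michigan-style / local: give each component its own share of blame
# (the gradient) and adjust the pieces in place.
strategy = [0.0, 0.0, 0.0]
for _ in range(100):
    grads = [2 * (s - t) for s, t in zip(strategy, TARGET)]
    strategy = [s - 0.1 * g for s, g in zip(strategy, grads)]

print(f"global-best loss: {loss(best_global):.4f}, "
      f"local-credit loss: {loss(strategy):.2e}")
```

On a smooth problem like this, per-component credit converges far closer to the target than selection among whole candidates, which is roughly the advantage backprop later delivered.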

In practice, two big innovations made the Michigan/Pittsburgh debate obsolete: backprop, and Q-learning. Backprop turned global feedback into local, in a theoretically sound way. Q-learning provided a way to assign credit in online contexts. In the light of history, we could say that the Michigan/Pittsburgh distinction conflated local-vs-global with online-vs-offline. There's no necessary connection between those two; online learning is compatible with assignment of local credit.

I think people generally understand the contribution of backprop and its importance. Backprop is essentially the correct version of what bucket-brigade was overtly trying to do: pass credit back along chains. Bucket-brigade wasn't quite right in how it did this, but backprop corrects the problems.

So what's the importance of Q-learning? I want to discuss that in more detail.

The Conceptual Difficulty of 'Online Search'

In online learning, you are repeatedly producing outputs of some kind (call them "actions") while repeatedly getting feedback of some kind (call it "reward"). But, you don't know how to associate particular actions (or combinations of actions) with particular rewards. I might take the critical action at time 12, and not see the payoff until time 32.

In offline learning, you can solve this with a sledgehammer: you can take the total reward over everything, with one fixed internal architecture. You can try out different internal architectures and see how well each one does.

Basically, in offline learning, you have a function you can optimize. In online learning, you don't.

Backprop is just a computationally efficient way to do hillclimbing search, where we repeatedly look for small steps which improve the overall fitness. But how do you do this if you don't have a fitness function? This is part of the gap between selection vs control: selection has access to an evaluation function; control processes do not have this luxury.

Q-learning and other reinforcement learning (RL) techniques provide a way to define the equivalent of a fitness function for online problems, so that you can learn.
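To make that concrete, here is a minimal tabular Q-learning sketch on a toy problem where the reward arrives two actions after the decision that earned it. The environment, state and action names, and hyperparameters are all made up for illustration:

```python
import random

def q_learning(transitions, episodes=2000, alpha=0.5, gamma=0.9, eps=0.1):
    """Minimal tabular Q-learning on a tiny deterministic MDP.

    transitions: dict mapping (state, action) -> (next_state, reward, done).
    The bootstrapped target r + gamma * max_a' Q(s', a') lets credit for a
    late reward flow back to earlier actions, without matching rewards to
    actions by hand.
    """
    random.seed(0)
    states = {s for (s, _) in transitions}
    actions = sorted({a for (_, a) in transitions})
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s, done = "start", False
        while not done:
            # epsilon-greedy action selection
            a = random.choice(actions) if random.random() < eps else \
                max(actions, key=lambda act: Q[(s, act)])
            s2, r, done = transitions[(s, a)]
            target = r if done else r + gamma * max(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

# Reward only arrives at the end, two steps after the critical first action:
T = {
    ("start", "left"):  ("mid",  0.0, False),
    ("start", "right"): ("dead", 0.0, True),
    ("mid",   "left"):  ("goal", 1.0, True),
    ("mid",   "right"): ("dead", 0.0, True),
}
Q = q_learning(T)
print(Q[("start", "left")] > Q[("start", "right")])  # credit reached step one
```

The learned value of the early action converges toward the discounted reward (here about 0.9), even though the immediate reward for that action was zero.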

Models to the Rescue

So, how can we associate rewards with actions?

One approach is to use a model.

Consider the example of firing employees. A corporation gets some kind of feedback about how it is doing, such as overall profit. However, there's often a fairly detailed understanding of what's driving those figures:

  • Low profit won't just be a mysterious signal which must be interpreted; a company will be able to break this down into more specific issues such as low sales vs high production costs.
  • There's some understanding of product quality, and how that relates to sales. A company may have a good idea of which product-quality issues it needs to improve, if poor quality is impacting sales.
  • There's a fairly detailed understanding of the whole production line, including which factors may impact product quality or production expenses. If a company sees problems, it probably also has a pretty good idea of which areas they're coming from.
  • There are external factors, such as economic conditions, which may affect sales without indicating anything about the quality of the company's current strategy. Thus, our model may sometimes lead us to ignore feedback.

So, models allow us to interpret feedback signals, match these to specific aspects of our strategy, and adapt strategies accordingly.

Q-learning assumes, among other things, that the state is fully observable.

Naturally, we would like to reduce the strengths of the assumptions we have to make as much as we can. One way is to look at increasingly rich model classes. AIXI uses all computable models. But maybe "all computable models" is still too restrictive; we'd like to get results without assuming a grain of truth. (That's why I am not really discussing Bayesian models much in this post; I don't want to assume a grain of truth.) So we back off even further, and use logical induction or InfraBayes. Ok, sure.

But wouldn't the best way be to try to learn without models at all? That way, we reduce our "modeling assumptions" to zero.

After all, there's something called "model-free learning", right?

Model-Free Learning Requires Models

How does model-free learning work? Well, often you work with a simulable environment, which means you can estimate the quality of a policy by running it many times, and use algorithms such as policy-gradient to learn. This is called "model-free learning" because the learning part of the algorithm doesn't try to predict the consequences of actions; you're just learning which action to take. From our perspective here, though, this is 100% cheating; you can only learn because you have a good model of the environment.

Moreover, model-free learning typically works by splitting up tasks into episodes. An episode is a period of time for which we assume rewards are self-enclosed, such as a single playthrough of an Atari game, a single game of Chess or Go, etc. This approach doesn't solve a detailed reward-matching problem, attributing reward to specific actions; instead it relies on a coarse reward-matching. Nonetheless, it's a rather strong assumption: an animal learning about an environment can't separate its experience into episodes which aren't related to each other. Clearly this is a "model" in the sense of a strong assumption about how specific reward signals are associated with actions.

Part of the problem is that most reinforcement learning (RL) researchers aren't really interested in getting past these limitations. Simulable environments offer the incredible advantage of being able to learn very fast, by simulating far more iterations than could take place in a real environment. And most tasks can be reasonably reduced to episodes.

However, this won't do as a model of intelligent agency in the wild. Neither evolution nor the free market divide things into episodes. (No, "one lifetime" isn't like "one episode" here -- that would only be the case if total reward due to actions taken in that lifetime could be calculated, EG, as total number of offspring. This would ignore inter-generational effects like parenting and grandparenting, which improve reproductive fitness of offspring at a cost in total offspring.)

What about more theoretical models of model-free intelligence?

Idealized Intelligence

AIXI is the gold-standard theoretical model of arbitrarily intelligent RL, but it's totally model-based. Is there a similar standard for model-free RL?

The paper Optimal Direct Policy Search by Glasmachers and Schmidhuber (henceforth, ODPS) aims to do for model-free learning what AIXI does for model-based learning. Where AIXI has to assume that there's a best computable model of the environment, ODPS instead assumes that there's a computable best policy. It searches through the policies without any model of the environment, or any planning.

I would argue that their algorithm is incredibly dumb, when compared to AIXI:

The basic simple idea of our algorithm is a nested loop that simultaneously makes the following quantities tend to infinity: the number of programs considered, the number of trials over which a policy is averaged, the time given to each program. At the same time, the fraction of trials spent on exploitation converges towards 1.

In other words, it tries each possible strategy in turn, for longer and longer trials, interleaved with exploiting whichever strategy has worked best for even longer stretches than that.

Basically, we're cutting things into episodes again, but we're making the episodes longer and longer, so that they have less and less to do with each other, even though they're not really disconnected. This only works because ODPS makes an ergodicity assumption: the environments are assumed to be POMDPs which eventually return to the same states over and over, which kind of gives us an "effective episode length" after which the environment basically forgets about what you did earlier.
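A crude sketch of that nested-loop structure might look like the following. (This is my own toy rendering, not the authors' actual algorithm: in particular, I replace program enumeration with a fixed finite list of candidate policies, and `run_policy` is an assumed interface returning average reward.)

```python
def direct_policy_search(policies, run_policy, phases=5):
    """ODPS-style loop sketch: in phase n, evaluate each candidate policy over
    a longer trial, then spend a growing fraction of time exploiting the best
    policy found so far."""
    best = policies[0]
    for n in range(1, phases + 1):
        trial_len = 2 ** n                       # evaluation windows grow without bound
        scores = [run_policy(pi, trial_len) for pi in policies]
        best = policies[max(range(len(policies)), key=lambda i: scores[i])]
        run_policy(best, n * trial_len)          # exploitation fraction tends to 1
    return best
```

The ergodicity assumption is hiding in `run_policy`: averaging a trial only estimates a policy's quality if the environment keeps returning to comparable situations.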

In contrast, AIXI makes no ergodicity assumption.

So far, it seems like we either need (a) some assumption which allows us to match rewards to actions, such as an episodic assumption or ergodicity; or, (b) a more flexible model-learning approach, which separately learns a model and then applies the model to solve credit-assignment.

Is this a fundamental obstacle?

I think a better attempt is Schmidhuber's On Learning How to Learn Learning Strategies, in which a version of policy search is explored in which parts of the policy-search algorithm are considered part of the policy (ie, modified over time). Specifically, the policy controls the episode boundary; the system is supposed to learn how often to evaluate policies. When a policy is evaluated, its average reward is compared to the lifetime average reward. If it's worse, we roll back the changes and proceed starting with the earlier strategy.

(Let's pause for a moment and imagine an agent like this. If it goes through a rough period in life, its response is to get amnesia, rolling back all cognitive changes to a point before the rough period began.)

This approach doesn't require an episodic or ergodic environment. We don't need things to reliably return to specific repeatable experiments. Instead, it only requires that the environment rewards good policies reliably enough that those same policies can set a long enough evaluation window to survive.

The assumption seems pretty general, but certainly not necessary for rational agents to learn. There are some easy counterexamples where this system behaves abysmally. For example, we can take any environment and modify it by subtracting the time t from the reward, so that reward becomes more and more negative over time. Schmidhuber's agent becomes totally unable to learn in this setting. AIXI would have no problem.

Unlike the ODPS paper, I consider this to be progress on the AI credit assignment problem. Yet, the resulting agent still seems importantly less rational than model-based frameworks such as AIXI.

Actor-Critic

Let's go back to talking about things which RL practitioners might really use.

First, there are some forms of RL which don't require everything to be episodic.

One is actor-critic learning. The "actor" is the policy we are learning. The "critic" is a learned estimate of how good things are looking given the history. IE, we learn to estimate the expected value -- not just the next reward, but the total future discounted reward.

Unlike the reward, the expected value solves the credit assignment for us. Imagine we can see the "true" expected value. If we take an action and then the expected value increases, we know the action was good (in expectation). If we take an action and expected value decreases, we know it was bad (in expectation).

So, actor-critic works by (1) learning to estimate the expected value; (2) using the current estimated expected value to give feedback to learn a policy.
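A minimal tabular actor-critic sketch on a toy chain environment (everything here is an illustrative assumption): the critic learns V by TD(0), and the TD error `delta = r + gamma*V(s') - V(s)` is the "did things just get better or worse?" signal that assigns credit to the action taken:

```python
import math, random

def actor_critic(n_states=5, steps=5000, a_lr=0.1, c_lr=0.1, gamma=0.9, seed=0):
    """One-step actor-critic on a chain: action 1 moves right, action 0 left,
    reward 1 on reaching the last state (then reset to the start)."""
    rng = random.Random(seed)
    V = [0.0] * n_states
    prefs = [[0.0, 0.0] for _ in range(n_states)]   # softmax action preferences
    s = 0
    for _ in range(steps):
        z = [math.exp(p) for p in prefs[s]]
        a = 0 if rng.random() < z[0] / (z[0] + z[1]) else 1
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r, done = (1.0, True) if s2 == n_states - 1 else (0.0, False)
        delta = r + (0.0 if done else gamma * V[s2]) - V[s]
        V[s] += c_lr * delta                         # critic: move toward observed return
        for b in (0, 1):                             # actor: reinforce a in proportion to delta
            grad = (1.0 if b == a else 0.0) - z[b] / (z[0] + z[1])
            prefs[s][b] += a_lr * delta * grad
        s = 0 if done else s2
    return V, prefs
```

No episode boundary is needed for the update itself; each step's feedback is interpreted through the critic's prediction.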

What I want to point out here is that the critic still has "model" flavor. Actor-critic is called "model-free" because nothing is explicitly trained to anticipate the sensory observations, or the world-state. However, the critic is learning to predict; it's just that all we need to predict is expected value.

Policy Gradient

In the comments to the original version of this post, policy gradient methods were mentioned as a type of model-free learning which doesn't require any models even in this loose sense, IE, doesn't require simulable environments or episodes. I was surprised to hear that it doesn't require episodes. (Most descriptions of it do assume episodes, since practically speaking most people use episodes.) So are policy-gradient methods the true "model-free" credit assignment algorithm we seek?

As far as I understand, policy gradient works on two ideas:

  • Rather than correctly associating rewards with actions, we can associate a reward with all actions which came before it. Good actions will still come out on top in expectation. The estimate is just a whole lot noisier than it otherwise might be.
  • We don't really need a baseline to interpret reward against. I naively thought that when you see a sequence of rewards, you'd be in the dark about whether the sequence was "good" or "bad", so you wouldn't know how to generate a gradient. ("We earned 100K this quarter; should we punish or reward our CEO?") It turns out this isn't technically a show-stopper. Considering the actions actually taken, we move in their direction in proportion to the reward signal. ("Well, let's just give the CEO some fraction of the 100K; we don't know whether they deserve the bonus, but at least this way we're creating the right incentives.") This might end up reinforcing bad actions, but those tugs in different directions are just noise which should eventually cancel out. When they do, we're left with the signal: the gradient we wanted. So, once again, we see that this just introduces more noise without fundamentally compromising our ability to follow the gradient.

So one way to understand the policy-gradient theorem is: we can follow the gradient even when we can't calculate the gradient! Even when we sometimes get its direction totally turned around! We only need to ensure we follow it in expectation, which we can do without even knowing which pieces of feedback to think of as a good sign or a bad sign.
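To make those two bullet points concrete, here is a minimal REINFORCE sketch on a two-armed bandit where both arms pay positive rewards (a made-up example). Every action taken is pushed "up" in proportion to the raw reward that follows -- even the worse arm -- yet the tugs cancel in expectation and the better arm wins:

```python
import math, random

def reinforce_bandit(steps=3000, lr=0.05, seed=0):
    """Baseline-free REINFORCE on a two-armed bandit; arm 1 pays more on
    average, but both arms pay positive reward."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]
    means = [1.0, 2.0]
    for _ in range(steps):
        z = [math.exp(p) for p in prefs]
        p0 = z[0] / (z[0] + z[1])
        a = 0 if rng.random() < p0 else 1
        r = means[a] + rng.gauss(0, 0.5)
        # No baseline: reinforce whatever was done, scaled by raw reward.
        # The expected update still points along the true gradient.
        for b in (0, 1):
            grad = (1.0 if b == a else 0.0) - (p0 if b == 0 else 1 - p0)
            prefs[b] += lr * r * grad
    return prefs
```

The worse arm does get reinforced whenever it happens to be chosen, but because the log-probability gradients sum to zero under the policy, only the difference in expected reward survives the averaging.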

RL people reading this might have a better description of policy-gradient; please let me know if I've said something incorrect.

Anyway, are we saved? Does this provide a truly assumption-free credit assignment algorithm?

It obviously assumes linear causality, with future actions never responsible for past rewards. I won't begrudge it that assumption.

Besides that, I'm somewhat uncertain. The explanations of the policy-gradient theorem I found don't focus on deriving it in the most general setting possible, so I'm left guessing which assumptions are essential. Again, RL people, please let me know if I say something wrong.

However, it looks to me like it's just as reliant on the ergodicity assumption as the ODPS thing we looked at earlier. For gradient estimates to average out and point us in the right direction, we need to get into the same situation over and over again.

I'm not saying real life isn't ergodic (quantum randomness suggests it is), but mixing times are so long that you'd reach the heat death of the universe by the time things converge (basically by definition). By that point, it doesn't matter.

I still want to know if there's something like "the AIXI of model-free learning"; something which appears as intelligent as AIXI, but not via explicit model-learning.

Where Updates Come From

Here begins the crazier part of this post. This is all intuitive/conjectural.

Claim: in order to learn, you need to obtain an "update"/"gradient", which is a direction (and magnitude) you can shift in which is more likely than not an improvement.

Claim: predictive learning gets gradients "for free" -- you know that you want to predict things as accurately as you can, so you move in the direction of whatever you see. With Bayesian methods, you increase the weight of hypotheses which would have predicted what you saw; with gradient-based methods, you get a gradient in the direction of what you saw (and away from what you didn't see).
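A tiny sketch of that claim (an illustrative toy, not any particular system): with squared-error prediction loss, the observation itself supplies the update direction, with no need for counterfactuals:

```python
def predictive_update(p, observed, lr=0.1):
    """With loss L = (p - y)^2, the gradient step is just "move p toward what
    you saw": -dL/dp = 2(y - p). The feedback tells you how to learn."""
    return p + lr * 2 * (observed - p)

# Repeatedly observing y = 1 pulls the prediction toward 1.
p = 0.0
for y in [1.0, 1.0, 1.0]:
    p = predictive_update(p, y)
```

There is no analogue of this for learning-to-act: the reward you receive doesn't by itself say which direction to move the policy.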

Claim: if you're learning to act, you do not similarly get gradients "for free":

  • You don't know which actions, or sequences of actions, to assign blame/credit. This is unlike the prediction case, where we always know which predictions were wrong.
  • You don't know what the alternative feedback would have been if you'd done something different. You only get the feedback for the actions you chose. This is unlike the case for prediction, where we're rewarded for closeness to the truth. Changing outputs to be more like what was actually observed is axiomatically better, so we don't have to guess about the reward of alternative scenarios.
  • As a result, you don't know how to adjust your behavior based on the feedback received. Even if you can perfectly match actions to rewards, because we don't know what the alternative rewards would have been, we don't know what to learn: are actions like the one I took good, or bad?

(As discussed earlier, the policy gradient theorem does actually mitigate these three points, but apparently at the cost of an ergodicity assumption, plus much noisier gradient estimates.)

Claim: you have to get gradients from a source that already has gradients. Learning-to-act works by splitting up the task into (1) learning to anticipate expected value, and perhaps other things; (2) learning a good policy via the gradients we can get from (1).

What it means for a learning problem to "have gradients" is just that the feedback you get tells you how to learn. Predictive learning problems (supervised or unsupervised) have this; they can just move toward what's observed. Offline problems have this; you can define one big function which you're trying to optimize. Learning to act online doesn't have this, however, because it lacks counterfactuals.

The Gradient Gap

(I'm going to keep using the terms 'gradient' and 'update' in a more or less interchangeable way here; this is at a level of abstraction where there's not a big distinction.)

I'm going to call the "problem" the gradient gap. I want to call it a problem, even though we know how to "close the gap" via predictive learning (whether model-free or model-based). The issue with this solution is only that it doesn't feel elegant. It's weird that you have to run two different backprop updates (or whatever learning procedures you use); one for the predictive component, and another for the policy. It's weird that you can't "directly" use feedback to learn to act.

Why should we be interested in this "problem"? After all, this is a basic point in decision theory: to maximize utility under uncertainty, you need probability.

One part of it is that I want to scrap classical ("static") decision theory and move to a more learning-theoretic ("dynamic") view. In both AIXI and logical-induction based decision theories, we get a nice learning-theoretic foundation for the epistemics (solomonoff induction/logical induction), but, we tack on a non-learning decision-making unit on top. I have become skeptical of this approach. It puts the learning into a nice little box labeled "epistemics" and then tries to make a decision based on the uncertainty which comes out of the box. I think maybe we need to learn to act in a more fundamental fashion.

A symptom of this, I hypothesize, is that AIXI and logical induction DT don't have very good learning-theoretic properties. [AIXI's learning problems; LIDT's learning problems.] You can't say very much to recommend the policies they learn, except that they're optimal according to the beliefs of the epistemics box -- a fairly trivial statement, given that that's how you decide what action to take in the first place.

Now, in classical decision theory, there's a nice picture where the need for epistemics emerges nicely from the desire to maximize utility. The complete class theorem starts with radical uncertainty (ie, non-quantitative), and derives probabilities from a willingness to take Pareto improvements. That's great! I can tell you why you should have beliefs, on pragmatic grounds! What we seem to have in machine learning is a less nice picture, in which we need epistemics in order to get off the ground, but can't justify the results without circular reliance on epistemics.

So the gap is a real issue -- it means that we can have nice learning theory when learning to predict, but we lack nice results when learning to act.

This is the basic problem of credit assignment. Evolving a complex system, you can't determine which parts to credit for success or failure (to decide what to tweak) without a model. But the model is bound to be a lot of the interesting part! So we run into big problems, because we need "interesting" computations in order to evaluate the pragmatic quality/value of computations, but we can't get interesting computations to get ourselves started, so we need to learn...

Essentially, we seem doomed to run on a stratified credit assignment system, where we have an "incorruptible" epistemic system (which we can learn because we get those gradients "for free"). We then use this to define gradients for the instrumental part.

A stratified system is dissatisfying, and impractical. First, we'd prefer a more unified view of learning. It's just kind of weird that we need the two parts. Second, there's an obstacle to pragmatic/practical considerations entering into epistemics. We need to focus on predicting important things; we need to control the amount of processing power spent; things in that vein. But (on the two-level view) we can't allow instrumental concerns to contaminate epistemics! We risk corruption! As we saw with bucket-brigade, it's easy for credit assignment systems to allow parasites which destroy learning.

A more unified credit assignment system would allow those things to be handled naturally, without splitting into two levels; as things stand, any involvement of pragmatic concerns in epistemics risks the viability of the whole system.

Tiling Concerns & Full Agency

From the perspective of full agency (ie, the negation of partial agency), a system which needs a protected epistemic layer sounds suspiciously like a system that can't tile. You look at the world, and you say: "how can I maximize utility?" You look at your beliefs, and you say: "how can I maximize accuracy?" That's not a consequentialist agent; that's two different consequentialist agents! There can only be one king on the chessboard; you can only serve one master; etc.

If it turned out we really really need two-level systems to get full agency, this would be a pretty weird situation. "Agency" would seem to be only an illusion which can only be maintained by crippling agents and giving them a split-brain architecture where an instrumental task-monkey does all the important stuff while an epistemic overseer supervises. An agent which "breaks free" would then free itself of the structure which allowed it to be an agent in the first place.

On the other hand, from a partial-agency perspective, this kind of architecture could be perfectly natural. IE, if you have a learning scheme from which total agency doesn't naturally emerge, then there isn't any fundamental contradiction in setting up a system like this.

Part of the (potentially crazy) claim here is that having models always gives rise to some form of myopia. Even logical induction, which seems quite unrestrictive, makes LIDT fail problems such as ASP, making it myopic according to the second definition of my previous post. (We can patch this with LI policy selection, but for any particular version of policy selection, we can come up with decision problems for which it is "not updateless enough".) You could say it's myopic "across logical time", whatever that means.

If it were true that "learning always requires a model" (in the sense that learning-to-act always requires either learning-to-predict or hard-coded predictions), and if it were true that "models always give rise to some form of myopia", then this would confirm my conjecture in the previous post (that no learning scheme incentivises full agency).

This is all pretty out there; I'm not saying I believe this with high probability.

Evolution & Evolved Agents

Evolution is a counterexample to this view: evolution learns the policy "directly" in essentially the way I want. This is possible because evolution "gets the gradients for free" just like predictive learning does: the "gradient" here is just the actual reproductive success of each genome.

Unfortunately, we can't just copy this trick. Artificial evolution requires that we decide how to kill off / reproduce things, in the same way that animal breeding requires breeders to decide what they're optimizing for. This puts us back at square one; IE, needing to get our gradient from somewhere else.

Does this mean the "gradient gap" is a problem only for artificial intelligence, not for natural agents? No. If it's true that learning to act requires a 2-level system, then evolved agents would need a 2-level system in order to learn within their lifespan; they can't directly use the gradient from evolution, since it requires them to die.

Also, note that evolution seems myopic. (This seems complicated, so I don't want to get into pinning down exactly in which senses evolution is myopic here.) So, the case of evolution seems compatible with the idea that any gradients we can actually get are going to incentivize myopic solutions.

Similar comments apply to markets vs firms.

a system which needs a protected epistemic layer sounds suspiciously like a system that can't tile

I stand as a counterexample: I personally want my epistemic layer to have accurate beliefs—y'know, having read the sequences… :-P

I think of my epistemic system like I think of my pocket calculator: a tool I use to better achieve my goals. The tool doesn't need to share my goals.

The way I think about it is:

  • Early in training, the AGI is too stupid to formulate and execute a plan to hack into its epistemic level.
  • Late in training, we can hopefully get to the place where the AGI's values, like mine, involve a concept of "there is a real world independent of my beliefs", and its preferences involve the state of that world, and therefore "get accurate beliefs" becomes instrumentally useful and endorsed.
  • In between … well … in between, we're navigating treacherous waters …
Second, there's an obstacle to pragmatic/practical considerations entering into epistemics. We need to focus on predicting important things; we need to control the amount of processing power spent; things in that vein. But (on the two-level view) we can't allow instrumental concerns to contaminate epistemics! We risk corruption!

I mean, if the instrumental level has any way whatsoever to influence the epistemic level, it will be able to corrupt it with false beliefs if it's hell-bent on doing so, and if it's sufficiently intelligent and self-aware. But remember we're not protecting against a superintelligent adversary; we're just trying to "navigate the treacherous waters" I mentioned above. So the goal is to allow what instrumental influence we can on the epistemic system, while making it hard and complicated to outright corrupt the epistemic system. I think the things that human brains do for that are:

  • The instrumental level gets some influence over what to look at, where to go, what to read, who to talk to, etc.
  • There's a trick (involving acetylcholine) where the instrumental level has some influence over a multiplier on the epistemic level's gradients (a.k.a. learning rate). So the epistemic level always updates towards "more accurate predictions on this frame", but it updates infinitesimally in situations where prediction accuracy is instrumentally useless, and it updates strongly in situations where prediction accuracy is instrumentally important.
  • There's a different mechanism that creates the same end result as #2: namely, the instrumental level has some influence over what memories get replayed more or less often.
  • For #2 and #3, the instrumental level has some influence but not complete influence. There are other hardcoded algorithms running in parallel and flagging certain things as important, and the instrumental level has no straightforward way to prevent that from happening. 

Right, I basically agree with this picture. I might revise it a little:

  • Early, the AGI is too dumb to hack its epistemics (provided we don't give it easy ways to do so!).
  • In the middle, there's a danger zone.
  • When the AGI is pretty smart, it sees why one should be cautious about such things, and it also sees why any modifications should probably be in pursuit of truthfulness (because true beliefs are a convergent instrumental goal) as opposed to other reasons.
  • When the AGI is really smart, it might see better ways of organizing itself (eg, specific ways to hack epistemics which really are for the best even though they insert false beliefs), but we're OK with that, because it's really freaking smart and it knows to be cautious and it still thinks this.
So the goal is to allow what instrumental influence we can on the epistemic system, while making it hard and complicated to outright corrupt the epistemic system.

One important point here is that the epistemic system probably knows what the instrumental system is up to. If so, this gives us an important lever. For example, in theory, a logical inductor can't be reliably fooled by an instrumental reasoner who uses it (so long as the hardware, including the input channels, doesn't get corrupted), because it would know about the plans and compensate for them.

So if we could get a strong guarantee that the epistemic system knows what the instrumental system is up to (like "the instrumental system is transparent to the epistemic system"), this would be helpful.

Shapley Values [thanks Zack for reminding me of the name] are akin to credit assignment: you have a bunch of agents coordinating to achieve something, and then you want to assign payouts fairly based on how much each contribution mattered to the final outcome.

And the way you do this is, for each agent you look at how good the outcome would have been if everybody except that agent had coordinated, and then you credit each agent proportionally to how much the overall performance would have fallen off without them.

So what about doing the same here: send rewards to each contributor proportional to how much they improved the actual group decision (assessed by rerunning it without them and seeing how performance declines)?

I can't for the life of me remember what this is called

Shapley value

(Best wishes, Less Wrong Reference Desk)

Yeah, it's definitely related. The main thing I want to point out is that Shapley values similarly require a model in order to calculate. So you have to distinguish between the problem of calculating a detailed distribution of credit and being able to assign credit "at all" -- in artificial neural networks, backprop is how you assign detailed credit, but a loss function is how you get a notion of credit at all. Hence, the question "where do gradients come from?" -- a reward function is like a pile of money made from a joint venture; but to apply backprop or Shapley value, you also need a model of counterfactual payoffs under a variety of circumstances. This is a problem, if you don't have a separate "epistemic" learning process to provide that model -- ie, it's a problem if you are trying to create one big learning algorithm that does everything.

Specifically, you don't automatically know how to

send rewards to each contributor proportional to how much they improved the actual group decision

because in the cases I'm interested in, ie online learning, you don't have the option of

rerunning it without them and seeing how performance declines

-- because you need a model in order to rerun.

But, also, I think there are further distinctions to make. I believe that if you tried to apply Shapley value to neural networks, it would go poorly; and presumably there should be a "philosophical" reason why this is the case (why Shapley value is solving a different problem than backprop). I don't know exactly what the relevant distinction is.

(Or maybe Shapley value works fine for NN learning; but, I'd be surprised.)
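For concreteness, exact Shapley values can be computed for small games by averaging marginal contributions over all orderings. Note how the computation requires the caller to supply `value` on every counterfactual coalition -- which is exactly the "model" requirement at issue (the game here is a made-up example):

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: for each ordering of the players, credit each
    player with value(S ∪ {i}) - value(S) for the coalition S preceding them,
    then average over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            totals[p] += value(frozenset(coalition)) - before
    n_fact = factorial(len(players))
    return {p: t / n_fact for p, t in totals.items()}
```

The payouts sum to the value of the full coalition (efficiency), but each call to `value` on a partial coalition is a counterfactual query that an online learner has no way to run.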

Removing things entirely seems extreme. How about having a continuous "contribution parameter," where running the algorithm without an element would correspond to turning this parameter down to zero, but you could also set the parameter to 0.5 if you wanted that element to have 50% of the influence it has right now. Then you can send rewards to elements if increasing their contribution parameter would improve the decision.

Removing things entirely seems extreme.

Dropout is a thing, though.

Dropout is like the converse of this - you use dropout to assess the non-outdropped elements. This promotes resiliency to perturbations in the model - whereas if you evaluate things by how bad it is to break them, you could promote fragile, interreliant collections of elements over resilient elements.

I think the root of the issue is that this Shapley value doesn't distinguish between something being bad to break, and something being good to have more of. If you removed all my blood I would die, but that doesn't mean that I would currently benefit from additional blood.

Anyhow, the joke was that as soon as you add a continuous parameter, you get gradient descent back again.
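The joke can be made literal: crediting each element by nudging its contribution parameter is just a finite-difference gradient estimate. A sketch, with `performance` standing in for an assumed counterfactual oracle over contribution settings:

```python
def contribution_gradients(weights, performance, eps=1e-4):
    """Credit element i with the finite-difference estimate of how overall
    performance changes as its contribution multiplier is nudged. `performance`
    must evaluate counterfactual settings -- i.e., it is a model."""
    credits = []
    for i in range(len(weights)):
        up = list(weights); up[i] += eps
        down = list(weights); down[i] -= eps
        credits.append((performance(up) - performance(down)) / (2 * eps))
    return credits
```

Unlike the leave-one-out scheme, this distinguishes "bad to remove" from "good to have more of": an element can have a large removal cost but a zero (or negative) gradient at its current setting.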

Unfortunately, we can't just copy this trick. Artificial evolution requires that we decide how to kill off / reproduce things, in the same way that animal breeding requires breeders to decide what they're optimizing for. This puts us back at square one; IE, needing to get our gradient from somewhere else.
Suppose we have a good reward function (as is typically assumed in deep RL). We can just copy the trick in that setting, right? But the rest of the post makes it sound like you still think there's a problem, in that even with that reward, you don't know how to assign credit to each individual action. This is a problem that evolution also has; evolution seemed to manage it just fine.
(Similarly, even if you think actor-critic methods don't count, surely REINFORCE is one-level learning? It works okay; added bells and whistles like critics are improvements to its sample efficiency.)

Yeah, I pretty strongly think there's a problem -- not necessarily an insoluble problem, but, one which has not been convincingly solved by any algorithm which I've seen. I think presentations of ML often obscure the problem (because it's not that big a deal in practice -- you can often define good enough episode boundaries or whatnot).

  • Yeah, I feel like "matching rewards to actions is hard" is a pretty clear articulation of the problem.
  • I agree that it should be surprising, in some sense, that getting rewards isn't enough. That's why I wrote a post on it! But why do you think it should be enough? How do we "just copy the trick"??
  • I don't agree that this is analogous to the problem evolution has. If evolution just "received" the overall population each generation, and had to figure out which genomes were good/bad based on that, it would be a more analogous situation. However, that's not at all the case. Evolution "receives" a fairly rich vector of which genomes were better/worse, each generation. The analogous case for RL would be if you could output several actions each step, rather than just one, and receive feedback about each. But this is basically "access to counterfactuals"; to get this, you need a model.

No, definitely not, unless I'm missing something big.

From page 329 of this draft of Sutton & Barto:

Note that REINFORCE uses the complete return from time t, which includes all future rewards up until the end of the episode. In this sense REINFORCE is a Monte Carlo algorithm and is well defined only for the episodic case with all updates made in retrospect after the episode is completed (like the Monte Carlo algorithms in Chapter 5). This is shown explicitly in the boxed algorithm on the next page.

So, REINFORCE "solves" the assignment of rewards to actions via the blunt device of an episodic assumption; all rewards in an episode are grouped with all actions during that episode. If you expand the episode to infinity (so as to make no assumption about episode boundaries), then you just aren't learning. This means it's not applicable to the case of an intelligence wandering around and interacting dynamically with a world, where there's no particular bound on how the past may relate to present reward.
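To make that concrete, here's a minimal sketch (a toy setup of my own, not from the post or the book) of the episodic REINFORCE update: every action in the episode, early or late, is reinforced by the same complete return G.

```python
import numpy as np

def softmax(theta):
    e = np.exp(theta - theta.max())
    return e / e.sum()

def reinforce_episode_update(theta, actions, rewards, alpha=0.1):
    """Episodic REINFORCE: credit the complete return G to every action."""
    G = sum(rewards)                 # all rewards in the episode, lumped together
    probs = softmax(theta)
    new_theta = theta.copy()
    for a in actions:                # the same G, no matter when a occurred
        grad_log_pi = -probs.copy()
        grad_log_pi[a] += 1.0
        new_theta = new_theta + alpha * G * grad_log_pi
    return new_theta

theta = np.zeros(2)
# The only reward arrives after the final action, yet the first action
# is credited exactly as strongly as the last one.
actions, rewards = [0, 1, 1], [0.0, 0.0, 1.0]
theta_after = reinforce_episode_update(theta, actions, rewards)
```

The episode boundary is doing all of the credit-assignment work here; nothing in the update distinguishes the action that "earned" the reward from the ones that didn't.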

The "model" is thus extremely simple and hardwired, which makes it seem one-level. But you can't get away with this if you want to interact and learn on-line with a really complex environment.

Also, since the episodic assumption is a form of myopia, REINFORCE is compatible with the conjecture that any gradients we can actually construct are going to incentivize some form of myopia.

Oh, I see. You could also have a version of REINFORCE that doesn't make the episodic assumption, where every time you get a reward, you take a policy gradient step for each of the actions taken so far, with a weight that decays as actions go further back in time. You can't prove anything interesting about this, but you also can't prove anything interesting about actor-critic methods that don't have episode boundaries, I think. Nonetheless, I'd expect it would somewhat work, in the same way that an actor-critic method would somewhat work. (I'm not sure which I expect to work better; partly it depends on the environment and the details of how you implement the actor-critic method.)

(All of this said with very weak confidence; I don't know much RL theory)

You could also have a version of REINFORCE that doesn't make the episodic assumption, where every time you get a reward, you take a policy gradient step for each of the actions taken so far, with a weight that decays as actions go further back in time. You can't prove anything interesting about this, but you also can't prove anything interesting about actor-critic methods that don't have episode boundaries, I think.

Yeah, you can do this. I expect actor-critic to work better, because your suggestion is essentially a fixed model which says that actions are more relevant to temporally closer rewards (and that this is the only factor to consider).
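A minimal sketch of that non-episodic variant (my own toy implementation of the suggestion above, not a standard algorithm): each incoming reward updates every past action, down-weighted by the action's age. The decay constant is exactly the fixed "model" in question: it hard-codes the assumption that temporally nearer actions deserve more credit.

```python
import numpy as np

def decayed_reinforce_step(theta, history, reward, decay=0.5, alpha=0.1):
    """history: list of (action, action_probs_at_the_time), oldest first.
    Each reward updates all past actions, weighted by temporal distance."""
    new_theta = theta.copy()
    for age, (a, probs) in enumerate(reversed(history)):
        w = decay ** age             # the hard-coded credit model: nearer = more credit
        grad_log_pi = -probs.copy()
        grad_log_pi[a] += 1.0
        new_theta = new_theta + alpha * w * reward * grad_log_pi
    return new_theta

theta = np.zeros(2)
half = np.array([0.5, 0.5])
history = [(0, half), (1, half)]     # action 0 taken first, then action 1
theta_after = decayed_reinforce_step(theta, history, reward=1.0)
```

With decay=0.5, the older action receives half the credit of the most recent one for the same reward, regardless of which action actually caused it.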

I'm not sure how to further convey my sense that this is all very interesting. My model is that you're like "ok sure" but don't really see why I'm going on about this.

Yeah, I think this is basically right. For the most part though, I'm trying to talk about things where I disagree with some (perceived) empirical claim, as opposed to the overall "but why even think about these things" -- I am not surprised when it is hard to convey why things are interesting in an explicit way before the research is done.

Here, I was commenting on the perceived claim of "you need to have two-level algorithms in order to learn at all; a one-level algorithm is qualitatively different and can never succeed", where my response is "but no, REINFORCE would do okay, though it might be more sample-inefficient". But it seems like you aren't claiming that, just claiming that two-level algorithms do quantitatively but not qualitatively better.

Actually, that wasn't what I was trying to say. But, now that I think about it, I think you're right.

I was thinking of the discounting variant of REINFORCE as having a fixed, but rather bad, model associating rewards with actions: rewards are tied more with actions nearby. So I was thinking of it as still two-level, just worse than actor-critic.

But, although the credit assignment will make mistakes (a predictable punishment which the agent can do nothing to avoid will nonetheless make any actions leading up to the punishment less likely in the future), they should average out in the long run (those 'wrongfully punished' actions should also be 'wrongfully rewarded'). So it isn't really right to think it strongly depends on the assumption.

Instead, it's better to think of it as a true discounting function. IE, it's not an assumption about the structure of consequences; it's an expression of how much the system cares about distant rewards when taking an action. Under this interpretation, REINFORCE indeed "closes the gradient gap" -- solves the credit assignment problem w/o restrictive modeling assumptions.

Maybe. It might also be argued that REINFORCE depends on some properties of the environment, such as ergodicity. I'm not that familiar with the details.

But anyway, it now seems like a plausible counterexample.

One part of it is that I want to scrap classical (“static”) decision theory and move to a more learning-theoretic (“dynamic”) view.

Can you explain more what you mean by this, especially "learning-theoretic"? I've looked at learning theory a bit and the typical setup seems to involve a loss or reward that is immediately observable to the learner, whereas in decision theory, utility can be over parts of the universe that you can't see and therefore can't get feedback from, so it seems hard to apply typical learning theory results to decision theory. I wonder if I'm missing the whole point though... What do you think are the core insights or ideas of learning theory that might be applicable to decision theory?

(I don't speak for Abram but I wanted to explain my own opinion.) Decision theory asks, given certain beliefs an agent has, what is the rational action for em to take. But, what are these "beliefs"? Different frameworks have different answers for that. For example, in CDT a belief is a causal diagram. In EDT a belief is a joint distribution over actions and outcomes. In UDT a belief might be something like a Turing machine (inside the execution of which the agent is supposed to look for copies of emself). Learning theory allows us to gain insight through the observation that beliefs must be learnable , otherwise how would the agent come up with these beliefs in the first place? There might be parts of the beliefs that come from the prior and cannot be learned, but still, at least the type signature of beliefs should be compatible with learning.

Moreover, decision problems are often implicitly described from the point of view of a third party. For example, in Newcomb's paradox we postulate that Omega can predict the agent, which makes perfect sense for an observer looking from the side, but might be difficult to formulate from the point of view of the agent itself. Therefore, understanding decision theory requires the translation of beliefs from the point of view of one observer to the point of view of another. Here also learning theory can help us: we can ask, what are the beliefs Alice should expect Bob to learn given particular beliefs of Alice about the world? From a slightly different angle, the central source of difficulty in decision theory is the notion of counterfactuals, and the attempt to prescribe particular meaning to them, which different decision theories do differently. Instead, we can just postulate that, from the subjective point of view of the agent, counterfactuals are ontologically basic. The agent believes emself to have free will, so to speak. Then, the interesting question is, what kind of counterfactuals are produced by the translation of beliefs from the perspective of a third party to the perspective of the given agent.

Indeed, thinking about learning theory led me to the notion of quasi-Bayesian agents (agents that use incomplete/fuzzy models ), and quasi-Bayesian agents automatically solve all Newcomb-like decision problems. In other words, quasi-Bayesian agents are effectively a rigorous version of UDT.

Incidentally, to align AI we literally need to translate beliefs from the user's point of view to the AI's point of view. This is also solved via the same quasi-Bayesian approach. In particular, this translation process preserves the "point of updatelessness", which, in my opinion, is the desired result (the choice of this point is subjective).

My thinking is somewhat similar to Vanessa's . I think a full explanation would require a long post in itself. It's related to my recent thinking about UDT and commitment races . But, here's one way of arguing for the approach in the abstract.

You once asked :

Assuming that we do want to be pre-rational, how do we move from our current non-pre-rational state to a pre-rational one? This is somewhat similar to the question of how do we move from our current non-rational (according to ordinary rationality) state to a rational one. Expected utility theory says that we should act as if we are maximizing expected utility, but it doesn't say what we should do if we find ourselves lacking a prior and a utility function (i.e., if our actual preferences cannot be represented as maximizing expected utility).
The fact that we don't have good answers for these questions perhaps shouldn't be considered fatal to pre-rationality and rationality, but it's troubling that little attention has been paid to them, relative to defining pre-rationality and rationality. (Why are rationality researchers more interested in knowing what rationality is, and less interested in knowing how to be rational? Also, BTW, why are there so few rationality researchers? Why aren't there hordes of people interested in these issues?)

My contention is that rationality should be about the update process . It should be about how you adjust your position. We can have abstract rationality notions as a sort of guiding star, but we also need to know how to steer based on those.

Some examples:

  • Logical induction can be thought of as the result of performing this transform on Bayesianism; it describes belief states which are not coherent, and gives a rationality principle about how to approach coherence -- rather than just insisting that one must somehow approach coherence.
  • Evolutionary game theory is more dynamic than the Nash story. It concerns itself more directly with the question of how we get to equilibrium. Strategies which work better get copied. We can think about the equilibria, as we do in the Nash picture; but, the evolutionary story also lets us think about non-equilibrium situations. We can think about attractors (equilibria being point-attractors, vs orbits and strange attractors), and attractor basins; the probability of ending up in one basin or another; and other such things.
  • However, although the model seems good for studying the behavior of evolved creatures, there does seem to be something missing for artificial agents learning to play games; we don't necessarily want to think of there as being a population which is selected on in that way.
  • The complete class theorem describes utility-theoretic rationality as the end point of taking Pareto improvements. But, we could instead think about rationality as the process of taking Pareto improvements. This lets us think about (semi-)rational agents whose behavior isn't described by maximizing a fixed expected utility function, but who develop one over time. (This model in itself isn't so interesting, but we can think about generalizing it; for example, by considering the difficulty of the bargaining process -- subagents shouldn't just accept any Pareto improvement offered.)
  • Again, this model has drawbacks. I'm definitely not saying that by doing this you arrive at the ultimate learning-theoretic decision theory I'd want.
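As a toy illustration of that process view (my own construction, with made-up utilities, not from the post): two subagents with different utility functions over three options, where a move is adopted only if it is a Pareto improvement, i.e. makes neither subagent worse off.

```python
# Two subagents with different utility functions over options A, B, C
# (illustrative numbers).
u1 = {"A": 1.0, "B": 0.0, "C": 0.7}
u2 = {"A": 0.0, "B": 0.6, "C": 0.7}

def pareto_improvement(current, candidate):
    """Accept only moves that make neither subagent worse off."""
    return u1[candidate] >= u1[current] and u2[candidate] >= u2[current]

current = "B"
for candidate in ["A", "C", "A"]:
    if pareto_improvement(current, candidate):
        current = candidate          # adopt the move; otherwise keep the status quo
```

The process halts at C, even though subagent 1 would prefer A, because no remaining candidate is a Pareto improvement; the *endpoint* looks like fixed expected-utility maximization, but the rationality lives in the update process.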

Promoted to curated: It's been a while since this post has come out, but I've been thinking of the "credit assignment" abstraction a lot since then, and found it quite useful. I also really like the way the post made me curious about a lot of different aspects of the world, and I liked the way it invited me to boggle at the world together with you. 

I also really appreciated your long responses to questions in the comments, which clarified a lot of things for me. 

One thing comes to mind for maybe improving the post, though I think that's mostly a difference of competing audiences: 

I think some sections of the post end up referencing a lot of really high-level concepts, in a way that I think is valuable as a reference, but also in a way that might cause a lot of people to bounce off of it (even people with a pretty strong AI Alignment background). I can imagine a post that includes very short explanations of those concepts, or moves them into a context where they are more clearly marked as optional (since I think the post stands well without at least some of those high-level concepts).

I think I have juuust enough background to follow the broad strokes of this post, but not to quite grok the parts I think Abram was most interested in. 

It definitely caused me to think about credit assignment. I actually ended up thinking about it largely through the lens of Moral Mazes (where challenges of credit assignment combine with other forces to create a really bad environment). Re-reading this post, while I don't quite follow everything, I do successfully get a taste of how credit assignment fits into a bunch of different domains.

For the "myopia/partial-agency" aspects of the post, I'm curious how Abram's thinking has changed. This post AFAICT was a sort of "rant". A year after the fact, did the ideas here feel like they panned out?

It does seem like someone should someday write a post about credit assignment that's a bit more general. 

Most of my points from my curation notice still hold. And two years later, I am still thinking a lot about credit assignment as a perspective on many problems I am thinking about. 

This seems like one I would significantly re-write for the book if it made it that far. I feel like it got nominated for the introductory material, which I wrote quickly in order to get to the "main point" (the gradient gap). A better version would have discussed credit assignment algorithms more.

From the perspective of full agency (ie, the negation of partial agency), a system which needs a protected epistemic layer sounds suspiciously like a system that can't tile. You look at the world, and you say: "how can I maximize utility?" You look at your beliefs, and you say: "how can I maximize accuracy?" That's not a consequentialist agent; that's two different consequentialist agents!

For reinforcement learning with incomplete/fuzzy hypotheses, this separation doesn't exist, because the update rule for fuzzy beliefs depends on the utility function and in some sense even on the actual policy.

How does that work?

Actually I was somewhat confused about what the right update rule for fuzzy beliefs is when I wrote that comment. But I think I got it figured out now.

First, background about fuzzy beliefs:

Let E be the space of environments (defined as the space of instrumental states in Definition 9 here). A fuzzy belief is a concave function ϕ: E → [0, 1] s.t. sup ϕ = 1. We can think of it as the membership function of a fuzzy set. For an incomplete model Φ ⊆ E, the corresponding ϕ is the concave hull of the characteristic function of Φ (i.e. the minimal concave ϕ s.t. ϕ ≥ χ_Φ).

Let γ be the geometric discount parameter and U(γ) := (1 − γ) ∑_{n=0}^∞ γ^n r_n be the utility function. Given a policy π (EDIT: in general, we allow our policies to explicitly depend on γ), the value of π at ϕ is defined by

V^π(ϕ, γ) := 1 + inf_{μ∈E} (E^π_μ[U(γ)] − ϕ(μ))

The optimal policy and the optimal value for ϕ are defined by

π^*_{ϕ,γ} := argmax_π V^π(ϕ, γ)
V(ϕ, γ) := max_π V^π(ϕ, γ)
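As a numeric toy instance of the value defined above (illustrative numbers of my own; I also restrict the infimum to a finite set of environments, which the actual definition does not): environments with low membership ϕ(μ) contribute a smaller penalty inside the infimum, so implausible environments cannot drag the value down as far.

```python
# Toy instance of V^pi(phi) = 1 + min_mu (E_mu^pi[U] - phi(mu)),
# with two environments and two policies (all numbers made up).
envs = ["mu1", "mu2"]
phi = {"mu1": 1.0, "mu2": 0.3}       # membership degrees; sup phi = 1

expected_U = {                        # E_mu^pi[U] for each (policy, environment)
    ("pi_a", "mu1"): 0.8, ("pi_a", "mu2"): 0.2,
    ("pi_b", "mu1"): 0.5, ("pi_b", "mu2"): 0.6,
}

def value(pi):
    # low-membership environments subtract less, so they are
    # less able to achieve the minimum
    return 1.0 + min(expected_U[(pi, mu)] - phi[mu] for mu in envs)

optimal = max(["pi_a", "pi_b"], key=value)
```

Here pi_a is optimal: its poor performance in mu2 is partly excused by mu2's low membership, while pi_b's poor performance happens in the fully-plausible mu1.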

Given a policy π , the regret of π at ϕ is defined by

Rg^π(ϕ, γ) := V(ϕ, γ) − V^π(ϕ, γ)

π is said to learn ϕ when it is asymptotically optimal for ϕ as γ → 1, that is

lim_{γ→1} Rg^π(ϕ, γ) = 0

Given a probability measure ζ over the space of fuzzy hypotheses, the Bayesian regret of π at ζ is defined by

BRg^π(ζ, γ) := E_{ϕ∼ζ}[Rg^π(ϕ, γ)]

π is said to learn ζ when

lim_{γ→1} BRg^π(ζ, γ) = 0

If such a π exists, ζ is said to be learnable. Analogously to Bayesian RL, ζ is learnable if and only if it is learned by a specific policy π^*_ζ (the Bayes-optimal policy). To define it, we define the fuzzy belief ϕ_ζ by

ϕ_ζ(μ) := sup_{σ: supp ζ → E, E_{ϕ∼ζ}[σ(ϕ)] = μ} E_{ϕ∼ζ}[ϕ(σ(ϕ))]

We now define π^*_ζ := π^*_{ϕ_ζ}.

Now, updating: (EDIT: the definition was needlessly complicated, simplified)

Consider a history h ∈ (A × O)^* or h ∈ (A × O)^* × A. Here A is the set of actions and O is the set of observations. Define μ^*_ϕ by

μ^*_ϕ := argmax_{μ∈E} min_π (ϕ(μ) − E^π_μ[U])

Let E′ be the space of "environments starting from h". That is, if h ∈ (A × O)^* then E′ = E, and if h ∈ (A × O)^* × A then E′ is slightly different, because the history now begins with an observation instead of an action.

For any μ ∈ E and ν ∈ E′ we define [ν]^μ_h ∈ E by

[ν]^μ_h(o ∣ h′) := ν(o ∣ h′′) if h′ = h h′′, and μ(o ∣ h′) otherwise

Then, the updated fuzzy belief is

(ϕ ∣ h)(ν) := ϕ([ν]^{μ^*_ϕ}_h) + constant

You look at the world, and you say: "how can I maximize utility?" You look at your beliefs, and you say: "how can I maximize accuracy?" That's not a consequentialist agent; that's two different consequentialist agents!
Not... really? "how can I maximize accuracy?" is a very liberal agentification of a process that might be more drily thought of as asking "what is accurate?" Your standard sequence predictor isn't searching through epistemic pseudo-actions to find which ones best maximize its expected accuracy, it's just following a pre-made plan of epistemic action that happens to increase accuracy.
Though this does lead to the thought: if you want to put things on equal footing, does this mean you want to describe a reasoner that searches through epistemic steps/rules like an agent searching through actions/plans?

This is more or less how humans already conceive of difficult abstract reasoning. We don't solve integrals by gradient descent, we imagine doing some sort of tree search where the edges are different abstract manipulations of the integral. But for everyday reasoning, like navigating 3D space, we just use our specialized feed-forward hardware.

Yeah, I absolutely agree with this. My description that you quoted was over-dramaticizing the issue.

Really, what you have is an agent sitting on top of non-agentic infrastructure. The non-agentic infrastructure is "optimizing" in a broad sense because it follows a gradient toward predictive accuracy, but it is utterly myopic (doesn't plan ahead to cleverly maximize accuracy).

The point I was making, stated more accurately, is that you (seemingly) need this myopic optimization as a 'protected' sub-part of the agent, which the overall agent cannot freely manipulate (since if it could, it would just corrupt the policy-learning process by wireheading).

This is more or less how humans already conceive of difficult abstract reasoning.

Yeah, my observation is that it intuitively seems like highly capable agents need to be able to do that; to that end, it seems like one needs to be able to describe a framework where agents at least have that option without it leading to corruption of the overall learning process via the instrumental part strategically biasing the epistemic part to make the instrumental part look good.

(Possibly humans just use a messy solution where the strategic biasing occurs but the damage is lessened by limiting the extent to which the instrumental system can bias the epistemics -- eg, you can't fully choose what to believe.)

I found this a very interesting frame on things, and am glad I read it.

However, the critic is learning to predict; it's just that all we need to predict is expected value.

See also MuZero. Also note that predicting optimal value formally ties in with predicting a model. In MDPs, you can reconstruct all of the dynamics using just |S| optimal value functions.

I re-read this post thinking about how and whether this applies to brains...

The online learning conceptual problem (as I understand your description of it) says, for example, I can never know whether it was a good idea to have read this book, because maybe it will come in handy 40 years later. Well, this seems to be "solved" in humans by exponential / hyperbolic discounting. It's not exactly episodic, but we'll more-or-less be able to retrospectively evaluate whether a cognitive process worked as desired long before death.

Relatedly, we seem to generally make and execute plans that are (hierarchically) laid out in time and with a success criterion at its end, like "I'm going to walk to the store". So we get specific and timely feedback on whether that plan was successful.

We do in fact have a model class. It seems very rich; in terms of "grain of truth", well I'm inclined to think that nothing worth knowing is fundamentally beyond human comprehension, except for contingent reasons like memory and lifespan limitations (i.e. not because they are incompatible with the internal data structures). Maybe that's good enough?

Just some thoughts; sorry if this is irrelevant or I'm misunderstanding anything. :-)

The online learning conceptual problem (as I understand your description of it) says, for example, I can never know whether it was a good idea to have read this book, because maybe it will come in handy 40 years later. Well, this seems to be "solved" in humans by exponential / hyperbolic discounting. It's not exactly episodic, but we'll more-or-less be able to retrospectively evaluate whether a cognitive process worked as desired long before death.

I interpret you as suggesting something like what Rohin is suggesting , with a hyperbolic function giving the weights.

It seems (to me) the literature establishes that our behavior can be approximately described by the hyperbolic discounting rule (in certain circumstances anyway), but, comes nowhere near establishing that the mechanism by which we learn looks like this, and in fact has some evidence against. But that's a big topic. For a quick argument, I observe that humans are highly capable, and I generally expect actor/critic to be more capable than dumbly associating rewards with actions via the hyperbolic function. That doesn't mean humans use actor/critic; the point is that there are a lot of more-sophisticated setups to explore.
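For concreteness, here are the two weighting schemes being compared (constants of my own choosing, purely illustrative): a hyperbolic weight decays much more slowly at long delays than the geometric/exponential discount usual in RL, so distant actions still receive non-negligible credit.

```python
def hyperbolic_weight(delay, k=1.0):
    """Credit weight 1/(1 + k*delay), the hyperbolic-discounting form."""
    return 1.0 / (1.0 + k * delay)

def geometric_weight(delay, gamma=0.5):
    """Credit weight gamma**delay, the usual RL-style discount."""
    return gamma ** delay

# Both weight an immediate reward at 1, but the hyperbolic tail is
# far heavier at long delays.
w_h = [hyperbolic_weight(d) for d in range(11)]
w_g = [geometric_weight(d) for d in range(11)]
```

At delay 10 the hyperbolic weight is 1/11 ≈ 0.09 while the geometric weight is below 0.001; this heavy tail is also what produces hyperbolic discounting's characteristic preference reversals.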

We do in fact have a model class.

It's possible that our models are entirely subservient to instrumental stuff (ie, we "learn to think" rather than "thinking to learn"), which would mean we don't have the big split which I'm pointing to -- ie, that we solve the credit assignment problem "directly" somehow, rather than needing to learn to do so.

It seems very rich; in terms of "grain of truth", well I'm inclined to think that nothing worth knowing is fundamentally beyond human comprehension, except for contingent reasons like memory and lifespan limitations (i.e. not because they are incompatible with the internal data structures). Maybe that's good enough?
Claim: predictive learning gets gradients "for free" ... Claim: if you're learning to act, you do not similarly get gradients "for free". You take an action, and you see results of that one action. This means you fundamentally don't know what would have happened had you taken alternate actions, which means you don't have a direction to move your policy in. You don't know whether alternatives would have been better or worse. So, rewards you observe seem like not enough to determine how you should learn.

This immediately jumped out at me as an implausible distinction because I was just reading Surfing Uncertainty which goes on endlessly about how the machinery of hierarchical predictive coding is exactly the same as the machinery of hierarchical motor control (with "priors" in the former corresponding to "priors + control-theory-setpoints" in the latter, and with "predictions about upcoming proprioceptive inputs" being identical to the muscle control outputs). Example excerpt:

the heavy lifting that is usually done by the use of efference copy, inverse models, and optimal controllers [in the models proposed by non-predictive-coding people] is now shifted [in the predictive coding paradigm] to the acquisition and use of the predictive (generative) model (i.e., the right set of prior probabilistic ‘beliefs’). This is potentially advantageous if (but only if) we can reasonably assume that these beliefs ‘emerge naturally as top-down or empirical priors during hierarchical perceptual inference’ (Friston, 2011a, p. 492). The computational burden thus shifts to the acquisition of the right set of priors (here, priors over trajectories and state transitions), that is, it shifts the burden to acquiring and tuning the generative model itself. --Surfing Uncertainty chapter 4

I'm a bit hazy on the learning mechanism for this (confusingly-named) "predictive model" (I haven't gotten around to chasing down the references) and how that relates to what you wrote... But it does sorta sound like it entails one update process rather than two...

Yep, I 100% agree that this is relevant. The PP/Friston/free-energy/active-inference camp is definitely at least trying to "cross the gradient gap" with a unified theory as opposed to a two-system solution. However, I'm not sure how to think about it yet.

  • I may be completely wrong, but I have a sense that there's a distinction between learning and inference which plays a similar role; IE, planning is just inference, but both planning and inference work only because the learning part serves as the second "protected layer"??
  • It may be that the PP is "more or less" the Bayesian solution; IE, it requires a grain of truth to get good results, so it doesn't really help with the things I'm most interested in getting out of "crossing the gap".
  • Note that PP clearly tries to implement things by pushing everything into epistemics. On the other hand, I'm mostly discussing what happens when you try to smoosh everything into the instrumental system. So many of my remarks are not directly relevant to PP.
  • I get the sense that Friston might be using the "evolution solution" I mentioned; so, unifying things in a way which kind of lets us talk about evolved agents, but not artificial ones. However, this is obviously an oversimplification, because he does present designs for artificial agents based on the ideas.

Overall, my current sense is that PP obscures the issue I'm interested in more than solves it, but it's not clear.





Notice of Assignment: Debt Terms explained


A Notification of Assignment is employed to notify debtors that a third party has ‘acquired’ their debt. The new company (assignee) assumes responsibility for the collection processes, occasionally engaging a debt collection agency to retrieve the funds on their behalf.

Maxine McCreadie

2nd May 2019

A creditor's main goal is to lend you money and collect it back, so they're not keen on chasing those who fall into arrears. As such, they'll sometimes pass arrears on to other companies.

Being in debt can be confusing enough as it is, but especially so when you owe money to your mortgage lender and a letter then comes through your door from a company you've never heard of, asking you to make payments to them instead.

This is what's known as a Notice of Assignment (NOA). It is sent to inform you that a third party has bought a debt you owe from the company you borrowed from.

If your debt is assigned to a new owner, they will then take over the previous company’s responsibility for debt collection and will sometimes hire a collection agency to work on their behalf.


What is a notice of assignment?

A Notice of Assignment, in relation to debt, is a document used to inform debtors that their debt has been ‘purchased’ by a third party.

The notice serves to notify the debtor that a new company (known as the assignee) has taken over the responsibility of collecting the debt.

This means that the debtor should direct their future payments and communications regarding the debt to the assignee instead of the original creditor.

The assignee may choose to handle the debt collection procedures themselves or may engage a debt collection agency to recover the outstanding amount on their behalf.

Types of assignment

There are two types of assignment that a creditor can make: legal and equitable.

Both of them fall under the Law of Property Act 1925 and both require the creditor to notify you of the change in writing.

It also isn't possible to assign only part of a debt to a third party. If a creditor is 'selling' your debt, they have to sell it as a whole, and the debt will then belong to the purchasing company.

We set out the differences between legal and equitable assignments below.

A legal assignment gives the purchasing party the power to enforce the debt. You will also then make payments to this company instead of the original creditor.

When a debt goes through an equitable assignment, it is only the amount owed that is transferred.

In these instances, the purchasing company cannot enforce the debt and the original creditor will still retain their original rights and responsibilities.

Why do creditors sell debts?

One of the most common questions asked when a notice of assignment is received is: why? Why have they sold it, and how can they?

The answer is that it is perfectly legal for them to sell your debt to another company.

When you signed your credit agreement, there will have been a clause in the fine print stating that the creditor is able to assign their rights to a third party.

As you have signed for this, they do not need to ask your permission to ‘sell’ the debt and you are unfortunately unable to dispute it.

The only exception to this rule is if the lender subscribes to the Standards of Lending Practice and you have previously given evidence of mental health issues.

In these instances, your debt should not have been sold and you should seek advice on this.


What does a notice of assignment mean for you?

If a creditor passes one of your debts to a third party, they will notify the credit reference agencies that they are now responsible for the collection.

The previous company's name will be removed from your credit file, and any defaults will be registered in the new company's name.

Many people often find that having a debt being passed to a third party is a blessing in disguise.

The new company might be easier to deal with or be more flexible. They may offer to freeze interest on your debts, for example, giving you more scope to repay what you owe more quickly.

Ultimately, getting your debt paid off is in both your and the creditor's best interests.

Agreeing to a manageable payment plan gives you some breathing space and it can often mean they won’t need to take any further action against you.

It's also worth noting that assignment does not reset the six-year period after which a debt becomes statute-barred, and debts that are already statute-barred will remain so.

Assignment and debt collection agencies

Sometimes the purchasing company will employ a debt collection agency to act on their behalf, or the debt will be purchased by an agency itself.

They will take over the full rights to the debt and attempt to collect it from you in full.

As such, they will contact you by letter, phone calls, texts or emails. It also means that they can take further action against you should you continue to default on the account.

However, unless it is stated otherwise, debt collection agencies only work on behalf of a company.

The purchasing company will still own the debt, although some collection agencies do deal in debt purchasing also.

It’s also important to remember that although they can contact you for payment, they still have to abide by creditor etiquette.

They cannot pretend to have certain legal powers or lie to you, break data protection laws or search for you on social media.

You’ll likely find that debt collection agencies are often open to negotiations, so it is always best to contact them as soon as possible when they contact you for payment.


Assignment and debt solutions

If you are already in some form of debt solution, such as an IVA, Trust Deed or a DMP that is run privately by a company, you must notify the company running your agreement.

They will make the necessary updates to their records and arrange for payments to go to the new company.

If you are managing your own debts, you will need to cancel any payment to the original company and set up a new one to the purchasing company or debt collection agency.

In this instance, you may be asked to provide an up-to-date statement of affairs in case any changes need to be made.

If you’re receiving notices of assignment and struggling with debt collection, call us today. A qualified adviser will be on hand to give you free confidential advice and help you find the right solution for your debts.

KEY TAKEAWAYS

  • Creditors often transfer arrears to other companies to handle debt collection.
  • A Notice of Assignment (NOA) informs debtors that their debt has been purchased by a third party.
  • The assignee, the new company, assumes responsibility for collecting the debt.
  • Debtors should direct future payments and communication regarding the debt to the assignee.
  • The assignee may handle debt collection internally or engage a debt collection agency.

Maxine is an experienced writer, specialising in personal insolvency. With a wealth of experience in the finance industry, she has written extensively on the subject of Individual Voluntary Arrangements, Protected Trust Deeds, and various other debt solutions.

How we reviewed this article:

Our debt experts continually monitor the personal finance and debt industry, and we update our articles when new information becomes available.

Current Version

Written by Maxine McCreadie

Edited by Ben McCormack

Edited by Maxine McCreadie



© 2023 The Carrington Dean Group Limited. Authorised and regulated by the Financial Conduct Authority. FCA No: 674395. Registered in Scotland, Company Registration No SC 225672. Registered Address: Regent House, 5th Floor, 76 Renfield Street, Glasgow, G2 1NQ Samantha Warburton is authorised in the UK to act as Insolvency Practitioners by the Insolvency Practitioners Association IP Number: 12430


Credit Assignment to Academic Courses

Purpose and Scope

This policy establishes guidelines for assigning the number of credits earned through satisfactory fulfillment of requirements for academic courses. Reaffirming Boston University’s commitment to educational quality in terms that certify compliance with applicable government regulations and accreditation standards, the policy makes explicit the relationship between the credits assigned to an individual course and the expected work of a student completing that course. Credit assignment should be based on course-related activities regardless of how or where they take place (including online), so long as they are required and contribute materially to achievement of course objectives or program learning outcomes. Credit assignments may also consider the intensity of engagement with the faculty or subject matter, student responsibility for learning outcomes, and course-related learning taking place outside the classroom, including online. This policy articulates definitions that help to ensure a measure of consistency in the assignment of academic credit across all disciplines, while insisting that oversight of credit assignment rests with the faculty and academic administrators closest to instruction. The policy applies to all credit-bearing academic courses, regardless of course type, instructional format, mode of delivery, or length of the course.

Definitions

Faculty Instruction : Teaching or supervision of teaching carried out in a credit-bearing course by faculty or other authorized instructors, including graduate teaching assistants supervised by BU faculty.

Contact : Engagement of instructors with students to advance course objectives. Contact may take various forms: e.g., it may be face-to-face or online, synchronous or asynchronous, one-to-many or one-to-one, including faculty direction of students participating in for-credit externships or internships, clinical practicums, studios, research, or scholarship.

Scheduled contact hour : One weekly, required hour (50 minutes) or equivalent of faculty contact. In addition to class meetings reflected in the University Class Schedule, other required course activities or combinations of activities may count as scheduled contact for the purpose of assigning credit. Examples include faculty-student conferences, skills modules, and participation in online forums, film screenings, site visits, rehearsals and performances, etc. All such scheduled contact must be specified as required in course syllabuses and must contribute to a student’s grade or achievement of course objectives.

Instructors also require students to complete work outside of scheduled contact hours to fulfill course objectives. Outside work must normally include, but need not be limited to, two hours of regular weekly class preparation for each credit earned. Where expectations for the quantity and/or intellectual challenges of outside work exceed this minimum and materially increase overall student effort, the number of credits assigned to a particular course may be greater than the number of its scheduled contact hours. Examples include courses that entail extensive and/or intensive reading, writing, research, open-ended problem solving, practice-based assignments, or student responsibility for class meetings.

Course types : The following course types are covered by this policy and are aligned in the chart below with credit assignment guidelines.

  • Classroom-based: Scheduled contact occurs primarily face-to-face in a classroom setting.
  • Faculty-directed independent learning: Scheduled contact occurs via faculty supervision of students pursuing directed study for credit for such activities as capstone projects, independent work for distinction, or graduate thesis and dissertation requirements.
  • Place- or practice-based: Scheduled contact occurs in non-classroom locations such as corporations (internships), schools, or clinics.
  • Blended: Scheduled contact is a defined mixture of face-to-face and distance/online interactions.
  • Online: Scheduled contact is mediated entirely online.

Credit Assignment Guidelines

For courses offered during a typical 15-week semester, the combination of scheduled contact and independent student effort must be equivalent to at least 3 hours per week per credit hour. The guidelines should be adjusted accordingly a) for shorter courses, b) as directed by external agencies such as specialized accreditors, or c) as warranted by the standards of the discipline.
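The guideline above is simple arithmetic: each credit implies at least 3 hours of combined contact and independent effort per week over a 15-week semester, i.e. 45 hours in total, and a shorter course compresses the same total into fewer weeks. A minimal illustrative sketch of that calculation (the function names are our own, not part of the policy):

```python
# Illustrative sketch of the credit-hour guideline above (not official policy):
# each credit requires at least 3 hours/week of combined scheduled contact and
# outside work in a standard 15-week semester, so a course's total expected
# effort is credits * 3 * 15 hours, spread over however many weeks it runs.

STANDARD_WEEKS = 15
HOURS_PER_CREDIT_PER_WEEK = 3  # e.g. 1 scheduled contact hour + 2 hours of preparation

def total_effort_hours(credits: int) -> int:
    """Total student effort (hours) implied by a credit assignment."""
    return credits * HOURS_PER_CREDIT_PER_WEEK * STANDARD_WEEKS

def weekly_effort_hours(credits: int, weeks: int = STANDARD_WEEKS) -> float:
    """Minimum combined weekly effort for a course of the given length."""
    return total_effort_hours(credits) / weeks

# A 4-credit course over a standard semester needs 12 hours of effort per week;
# the same 4 credits in a 6-week summer session need 30 hours per week.
print(weekly_effort_hours(4))           # 12.0
print(weekly_effort_hours(4, weeks=6))  # 30.0
```

The adjustment for shorter courses corresponds to the `weeks` parameter: shortening the course raises the required weekly effort rather than reducing the total.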

Responsible Parties

School and College faculties are responsible for assigning academic credit to individual courses, for ensuring that credit assignments meet policy guidelines, and for approving exceptions to the guidelines. Typically, this oversight will occur in the context of usual school and college processes for curriculum development and review, and within curriculum oversight bodies such as curriculum committees.

The Vice President for Enrollment & Student Affairs and the Academic Deans are responsible for ensuring implementation of the policy by all credit-granting units of the University.

The University Registrar oversees the course catalog and is responsible for reporting regularly on the status of courses vis-à-vis the Course Credit Assignment Policy to the University Provost, the Vice President for Enrollment & Student Affairs, and the Council of Deans.

Effective Date:

Effective June 1, 2015 for all new courses developed after that date. Full implementation for existing courses will be completed for the June 1, 2017 Bulletin.

Table 1: Suggested Credit Assignment Guidelines by Learning and Teaching Activity in Online and Blended Courses.

Principles

In addition to the principles explicitly stated in the proposed policy’s “Purpose and Scope,” the following principles were used to establish credit assignment guidelines.

  • For the foreseeable future, the credit hour will remain the standard for awarding BU credentials, reporting to external entities, and complying with federal and state regulations. Thus, the definition of a credit hour and the assignment of credit to courses must be consistent with external regulations and standards for accreditation. In addition, credit assignment policies and practices should meet or exceed the best practices at peer institutions.
  • Although the credit hour is a useful concept, its basis in face-to-face, lecture-based instruction in a classroom neither reflects the range of current practices nor acknowledges changing instructional practices, which extend beyond traditional lectures to include online and blended online or place-based courses; internships, clinical practicums, and field placements; “flipped” classrooms; and studios, laboratories, and rehearsals. Thus, credit assignment guidelines must balance the need to stipulate guidance with the need for flexibility in its application to a wide range of pedagogies.
  • Finally, the guidelines are intended to reflect the variety of pedagogies, learning outcomes, and expectations for academic effort and achievement present at BU; and, to anticipate, to the extent possible, emerging pedagogies and technologies, as well as regulatory changes. In all cases, assignment of credit to courses rests with the faculty and relevant academic governance bodies, as does oversight of compliance with policy guidelines.

This proposal was drafted by the Course Credit Definition Committee:

Co-Chairs:

  • John Straub, Professor of Chemistry, CAS
  • Laurie Pohl, Vice President for Enrollment & Student Affairs

Members:

  • Lynne Allen, Professor and Director of the School of Visual Arts, CFA
  • Jack Beermann, Professor, LAW
  • Tobe Berkovitz, Associate Professor, COM
  • John Caradonna, Associate Professor of Chemistry, CAS
  • Janelle Heineke, Professor, Questrom
  • Karen Jacobs, Clinical Professor, SAR
  • J. Greg McDaniel, Associate Professor, ENG
  • Anita Patterson, Professor of English, CAS
  • L. Jay Samons, Professor of Classical Studies, CAS*
  • Stan Sclaroff, Professor of Computer Science, CAS
  • Adam Sweeting, Associate Professor, CGS
  • Jeffrey von Munkwitz-Smith, University Registrar, ex officio
  • Tanya Zlateva, Associate Professor and Dean ad interim, MET

*Professor Samons resigned from the committee in summer 2014.

The policy was approved by the University Council Committee on Undergraduate Programs and Policies (UAPP) on 4/7/15; by the University Council Committee on Graduate Programs and Policies (GAPP) on 4/16/15; and by the University Council on 5/6/15.

Assignment for benefit of creditors

Assignment for the benefit of creditors (ABC) (also known as general assignment for the benefit of creditors) is a voluntary alternative to formal bankruptcy proceedings that transfers all of a debtor's assets to a trust for liquidating and distributing them. The trustee will manage the assets to pay off debts to creditors, and if any assets are left over, they will be transferred back to the debtor.

ABC can provide many benefits to an insolvent business in lieu of bankruptcy. First, unlike in bankruptcy proceedings, the business can choose the trustee overseeing the process, who might know the specifics of the business better than an appointed trustee. Second, bankruptcy proceedings can take much more time, involve more steps, and further restrict how the business is liquidated compared to an ABC, which avoids judicial oversight. Third, dissolving or transferring a company through an ABC often avoids the negative publicity that bankruptcy generates. Finally, a company trying to purchase the assets of a struggling company can avoid liability to the failing company's unsecured creditors. This is important because most other options would expose the acquiring business to all the debt of the struggling business.

ABC has risen in popularity since the early 2000s, but its use varies by state. California embraces ABC with common-law oversight, while other states, such as Florida, use stricter statutory ABC structures. Also, depending on the state's corporate law and the company's charter, the struggling business may be required to obtain shareholder approval to use ABC, which can be difficult in large corporations.

[Last updated in June of 2021 by the Wex Definitions Team ]



What does 'DFA' mean in baseball? It's not an endearing abbreviation.

Albert Pujols. David Ortiz. Alex Rodriguez. Manny Ramirez. Nelson Cruz. Robinson Cano. Justin Upton.

Ortiz is enshrined in the Baseball Hall of Fame. Pujols is a lock for the Hall. Cruz is a future candidate for Cooperstown. And all were former major league All-Stars.

What do they all have in common?

Each of them has been DFA'd during his major league career.

Ultimately, it means the player is cut from a team. It's one of several transactions that can happen to an MLB player. But it's a more common process for players who are in the latter years of their career and in the middle of a contract.


What does DFA mean in baseball?

Designated for assignment.

It's one of the more unusual transaction types in baseball: unlike a trade, the player is removed from the roster and may ultimately be sent outright to the minor leagues or simply cut.

What does being designated for assignment mean?

Teams are allowed to have 40 players on their roster, with 26 of them active on the major league roster. Over the course of the season, teams make roster moves, which sometimes involves cutting a player. In order to take someone off the 40-man roster, they must be designated for assignment.

MLB.com explains the process: "When a player's contract is designated for assignment — often abbreviated "DFA" — that player is immediately removed from his club's 40-man roster. Within seven days of the transaction (had been 10 days under the 2012-16 Collective Bargaining Agreement), the player can either be traded or placed on irrevocable outright waivers."

Can another team claim a DFA'd player?

Yes, any team can pick up a player off waivers. However, if that team claims the player, they would have to add the player to their 40-man roster.

More baseball fun facts

  • What does BB mean in baseball?
  • What does OPS mean?
  • What was the longest baseball game?
  • Who invented baseball?

Definition of assignment

task , duty , job , chore , stint , assignment mean a piece of work to be done.

task implies work imposed by a person in authority or an employer or by circumstance.

duty implies an obligation to perform or responsibility for performance.

job applies to a piece of work voluntarily performed; it may sometimes suggest difficulty or importance.

chore implies a minor routine activity necessary for maintaining a household or farm.

stint implies a carefully allotted or measured quantity of assigned work or service.

assignment implies a definite limited task assigned by one in authority.


Word History

see assign entry 1

14th century, in the meaning defined at sense 1

Phrases Containing assignment

  • self-assignment


Cite this entry.

“Assignment.” Merriam-Webster.com Dictionary , Merriam-Webster, https://www.merriam-webster.com/dictionary/assignment. Accessed 25 Feb. 2024.



Americans’ reliance on credit cards is the key to Capital One’s bid for Discover

Capital One credit cards are shown in Mount Prospect, Ill., Tuesday, Feb. 20, 2024. Capital One Financial is buying Discover Financial Services for $35 billion, in a deal that would bring together two of the nation’s biggest lenders and credit card issuers. (AP Photo/Nam Y. Huh)


NEW YORK (AP) — Americans have become increasingly reliant on their credit cards since the pandemic. So much so that Capital One is willing to bet more than $30 billion that they won’t break the habit.

Capital One Financial announced Monday that it would buy Discover Financial Services for $35 billion . The combination could potentially shake up the payments industry, which is largely dominated by Visa and Mastercard.

For customers of the companies, it might eventually mean bigger perks and more merchant acceptance of Discover cards, and potentially lead to more competition in the payments industry. But most of the benefits will be going to the companies themselves, as well as the merchants who accept these cards.

Why is the deal important?

Some of the biggest issuers of credit cards are banks, like JPMorgan Chase and Citigroup. But Capital One and Discover are first and foremost credit card companies — like American Express, but with different clientele. They have tens of millions of customers and target their products at Americans who do not travel heavily outside the U.S. and would like to get more value out of their everyday purchases like gas, groceries and domestic travel. In other words, people who typically don’t carry premium credit cards.


The combined company will have more loans to customers on its credit cards than JPMorgan and Citigroup combined. The merger also gives the Discover network the ability to fight on more equal footing with Mastercard and American Express in a way that it simply hasn’t been able to in its 40-year history.

“You want the customer or merchant to choose you as a company, either for your products or for your brand, and this deal gives them plenty of opportunity to make that case,” said Sanjay Sakhrani, a payments industry analyst with Keefe, Bruyette & Woods.

Who uses Capital One and Discover?

Capital One is one of the biggest credit card companies and banks in the country. It typically operates what is known in the credit card industry as a “barbell” business model: it issues credit cards to those with less-than-great credit as well as to those with super-high credit, and little in between. The first group carries a balance, bringing the company interest revenue, while the high-end customers spend heavily on their cards, bringing in fee revenue from merchants.

Discover’s customers are fewer but intensely loyal to the company. The company consistently wins customer service awards, and its cash-back cards are considered among the most lucrative in the industry.

But Discover suffers from a perception problem: because its payment network is smaller than Visa’s, Mastercard’s or AmEx’s, it is seen as less desirable. Discover is also largely unavailable as a payment option outside the U.S.

Capital One executives said Tuesday that they would start allowing customers to use the Discover payment network shortly after the deal closes, which could happen by the end of the year. Capital One also plans to keep the Discover brand along with its cards, although the cards could be co-branded.

What does this deal say about credit card spending?

This deal, at its core, is a big bet that Americans will keep running up their credit card balances.

Americans have been increasing their card balances quickly amid two years of high inflation. In the fourth quarter of 2023, Americans held $1.13 trillion on their credit cards, and aggregate household debt balances increased by $212 billion, up 1.2%, according to the latest data from the New York Federal Reserve.

Consumers are also paying higher interest rates on those balances. The average interest rate on a bank credit card is roughly 21.5%, the highest it’s been since the Federal Reserve started tracking the data in 1994.

Critics of Capital One have long said the company relies heavily on those who can least afford to be carrying high interest balances on their credit cards. Historically Capital One has had higher default rates and higher 30-day delinquency rates than JPMorgan, Citi, Discover and American Express.

What’s so valuable about Discover?

It’s virtually impossible to build a credit and debit card network from scratch in today’s market. Capital One executives described previous efforts to do so as a “chicken or egg” problem, where it’s hard to get merchants to sign up for a payment network when there are few customers, and vice versa.

Chicago-based Discover may be small, but its infrastructure positions it to grow, particularly as more transactions move away from cash. The U.S. credit card industry is dominated by the Visa-Mastercard duopoly, with AmEx a distant third and Discover an even more distant fourth. Roughly $6.8 trillion runs on Visa’s credit and debit network, compared with only about $550 billion on Discover’s network.

Owning Discover’s network would enable Capital One to get revenue from fees charged for every merchant transaction that runs on the network.

It also turns Capital One into the rare credit card company that controls the cards, the payment network and the bank that issues the card. Only one other company has accomplished this at scale: American Express.

Will regulators approve the deal?

It’s unclear whether the deal will pass regulatory scrutiny. Nearly every bank issues a credit card to customers but few companies are credit card companies first, and banks second. Both Discover — which was long ago the Sears Card — and Capital One started off as credit card companies that expanded into other financial offerings like checking and savings accounts.

Bank regulators have signaled for some time that they want to give more scrutiny to large mergers in the financial services sector. The combined Discover-Capital One company will have more than $600 billion in assets, making it bigger than most large regional banks in the country.

Consumer groups are expected to put heavy pressure on the Biden Administration to make sure the deal is good for consumers as well as shareholders. Left-leaning politicians like Sen. Sherrod Brown, the powerful Democratic chair of the Senate Banking Committee, are already calling for close scrutiny of the deal.

“The deal also poses massive antitrust concerns, given the vertical integration of Capital One’s credit card lending with Discover’s credit card network,” said Jesse Van Tol, president and CEO of the National Community Reinvestment Coalition.


Tax Time Guide 2024: What to know before completing a tax return


IR-2024-45, Feb. 21, 2024

WASHINGTON — During the busiest time of the tax filing season, the Internal Revenue Service kicked off its 2024 Tax Time Guide series to help remind taxpayers of key items they’ll need to file a 2023 tax return.

As part of its four-part, weekly Tax Time Guide series, the IRS continues to provide new and updated resources to help taxpayers file an accurate tax return. Taxpayers can count on IRS.gov for updated resources and tools along with a special free help page available around the clock. Taxpayers are also encouraged to read Publication 17, Your Federal Income Tax (For Individuals) for additional guidance.

Essentials to filing an accurate tax return

The deadline this tax season for filing Form 1040, U.S. Individual Income Tax Return, or Form 1040-SR, U.S. Tax Return for Seniors, is April 15, 2024. However, those who live in Maine or Massachusetts will have until April 17, 2024, to file due to official holidays observed in those states.

Taxpayers are advised to wait until they receive all their proper tax documents before filing their tax returns. Filing without all the necessary documents could lead to mistakes and potential delays.

It’s important for taxpayers to carefully review their documents for any inaccuracies or missing information. If any issues are found, taxpayers should contact the payer immediately to request a correction or confirm that the payer has their current mailing or email address on file.

Creating an IRS Online Account can provide taxpayers with secure access to information about their federal tax account, including payment history, tax records and other important information.

Having organized tax records can make the process of preparing a complete and accurate tax return easier and may also help taxpayers identify any overlooked deductions or credits.

Taxpayers who have an Individual Taxpayer Identification Number or ITIN may need to renew it if it has expired and is required for a U.S. federal tax return. If an expiring or expired ITIN is not renewed, the IRS can still accept the tax return, but it may result in processing delays or delays in credits owed.

Changes to credits and deductions for tax year 2023

Standard deduction amount increased. For 2023, the standard deduction amount has been increased for all filers. The amounts are:

  • Single or married filing separately — $13,850.
  • Head of household — $20,800.
  • Married filing jointly or qualifying surviving spouse — $27,700.
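As a quick illustration, the filing-status amounts listed above can be captured in a small lookup table. This is a hypothetical helper; the status keys are my own naming, not IRS terminology.

```python
# 2023 standard deduction amounts, per the figures in this release.
# Key names are illustrative, not official IRS filing-status codes.
STANDARD_DEDUCTION_2023 = {
    "single": 13_850,
    "married_filing_separately": 13_850,
    "head_of_household": 20_800,
    "married_filing_jointly": 27_700,
    "qualifying_surviving_spouse": 27_700,
}

def standard_deduction(filing_status: str) -> int:
    """Return the 2023 standard deduction for a given filing status."""
    return STANDARD_DEDUCTION_2023[filing_status]

print(standard_deduction("head_of_household"))  # 20800
```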

Additional child tax credit amount increased. The maximum additional child tax credit amount has increased to $1,600 for each qualifying child.

Child Tax Credit enhancements. Many changes to the Child Tax Credit (CTC) that had been implemented by the American Rescue Plan Act of 2021 have expired.

However, the IRS continues to closely monitor legislation being considered by Congress affecting the Child Tax Credit. The IRS reminds taxpayers eligible for the Child Tax Credit that they should not wait to file their 2023 tax return this filing season. If Congress changes the CTC guidelines, the IRS will automatically make adjustments for those who have already filed so no additional action will be needed by those eligible taxpayers.

Under current law, for tax year 2023, the following currently apply:

  • The enhanced credit allowed for qualifying children under age 6 and children under age 18 has expired. For 2023, the initial amount of the CTC is $2,000 for each qualifying child. The credit amount begins to phase out where adjusted gross income (AGI) exceeds $200,000 ($400,000 in the case of a joint return). The amount of the CTC that can be claimed as a refundable credit is limited as it was in 2020, except that the maximum ACTC amount for each qualifying child has increased to $1,600.
  • The increased age allowance for a qualifying child has expired. A child must be under age 17 at the end of 2023 to be a qualifying child.

Changes to the Earned Income Tax Credit (EITC). The enhancements for taxpayers without a qualifying child implemented by the American Rescue Plan Act of 2021 will not apply for tax year 2023. To claim the EITC without a qualifying child in 2023, taxpayers must be at least age 25 but under age 65 at the end of 2023. If a taxpayer is married filing a joint return, one spouse must be at least age 25 but under age 65 at the end of 2023.

Taxpayers may find more information on the Child Tax Credit in the Instructions for Schedule 8812 (Form 1040).

New Clean Vehicle Credit. The credit for new qualified plug-in electric drive motor vehicles has changed. This credit is now known as the Clean Vehicle Credit. The maximum amount of the credit and some of the requirements to claim the credit have changed. The credit is reported on Form 8936, Qualified Plug-In Electric Drive Motor Vehicle Credit, and on Form 1040, Schedule 3.

More information on these and other credit and deduction changes for tax year 2023 may be found in Publication 17, Your Federal Income Tax (For Individuals).

1099-K reporting requirements have not changed for tax year 2023

Following feedback from taxpayers, tax professionals and payment processors, and to reduce taxpayer confusion, the IRS recently released Notice 2023-74 announcing a delay of the new $600 reporting threshold for tax year 2023 on Form 1099-K, Payment Card and Third-Party Network Transactions . The previous reporting thresholds will remain in place for 2023.

The IRS has published a fact sheet with further information to assist taxpayers concerning changes to 1099-K reporting requirements for tax year 2023.

Form 1099-K reporting requirements

Taxpayers who accept direct payment by credit, debit or gift card from customers or clients for selling goods or providing services should get a Form 1099-K from their payment processor or payment settlement entity, no matter how many payments they received or how much the payments were for.

If they used a payment app or online marketplace and received over $20,000 from over 200 transactions, the payment app or online marketplace is required to send a Form 1099-K. However, it can send a Form 1099-K for lower amounts. Whether or not the taxpayer receives a Form 1099-K, they must still report any income on their tax return.
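The reporting threshold retained for tax year 2023 can be expressed as a simple check. This is a hypothetical helper, not IRS code; note that both conditions must be met, and processors may still issue a form voluntarily below these levels.

```python
def form_1099k_required(gross_payments: float, transaction_count: int) -> bool:
    """True when a payment app or online marketplace must issue a
    Form 1099-K under the thresholds kept in place for 2023:
    more than $20,000 in gross payments AND more than 200 transactions."""
    return gross_payments > 20_000 and transaction_count > 200

print(form_1099k_required(25_000, 250))  # True
print(form_1099k_required(25_000, 150))  # False: not enough transactions
```

Either way, the income itself is reportable; the threshold only governs whether the processor must send the form.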

What’s taxable? It’s the profit from these activities that’s taxable income. The Form 1099-K shows the gross or total amount of payments received. Taxpayers can use it and other records to figure out the actual taxes they owe on any profits. Remember that all income, no matter the amount, is taxable unless the tax law says it isn’t – even if taxpayers don’t get a Form 1099-K.

What’s not taxable? Taxpayers shouldn’t receive a Form 1099-K for personal payments, including money received as a gift and for repayment of shared expenses. That money isn’t taxable. To prevent getting an inaccurate Form 1099-K, note those payments as “personal,” if possible.

Good recordkeeping is key. Be sure to keep good records because it helps when it’s time to file a tax return. It’s a good idea to keep business and personal transactions separate to make it easier to figure out what a taxpayer owes.

For details on what to do if a taxpayer gets a Form 1099-K in error or the information on their form is incorrect, visit IRS.gov/1099k or find frequently asked questions at Form 1099-K FAQs.

Direct File pilot program provides a new option this year for some

The IRS launched the Direct File pilot program during the 2024 tax season. The pilot will give eligible taxpayers an option to prepare and electronically file their 2023 tax returns, for free, directly with the IRS.

The Direct File pilot program will be offered to eligible taxpayers in 12 pilot states who have relatively simple tax returns reporting only certain types of income and claiming limited credits and deductions. The 12 states currently participating in the Direct File pilot program are Arizona, California, Florida, Massachusetts, Nevada, New Hampshire, New York, South Dakota, Tennessee, Texas, Washington state and Wyoming. Taxpayers can check their eligibility at directfile.irs.gov.

The Direct File pilot is currently in the internal testing phase and will be more widely available in mid-March. Taxpayers can get the latest news about the pilot at Direct File pilot news and sign up to be notified when Direct File is open to new users.

Finally, for comprehensive information on all these and other changes for tax year 2023, taxpayers and tax professionals are encouraged to read Publication 17, Your Federal Income Tax (For Individuals), as well as to visit other topics of taxpayer interest on IRS.gov.


Capital One to acquire Discover for $35 billion, merging 2 of the US' largest credit-card companies

  • Capital One is set to acquire Discover Financial Services in a $35.3 billion all-stock deal.
  • The deal would merge two of the largest credit-card issuers in the US.
  • Discover is coming off a difficult year after compliance lapses led to a CEO resignation.


Capital One Financial is set to acquire Discover Financial Services in an all-stock deal valued at $35.3 billion, the two companies announced on Monday.

The deal — first reported by The Wall Street Journal — would merge two of the largest credit-card issuers in the US.

"Our acquisition of Discover is a singular opportunity to bring together two very successful companies with complementary capabilities and franchises, and to build a payments network that can compete with the largest payments networks and payments companies," Richard Fairbank, Capital One's founder, chairman and CEO, said in the statement.

Under the deal, Discover shareholders will receive 1.0192 Capital One shares for each Discover share. This represents a 26.6% premium over Discover's closing price of $110.49 on Friday.

The deal is expected to close in late 2024 or early 2025.

Warren Buffett-backed Capital One, a bank that issues credit cards, serves more than 100 million customers, according to its website.

Meanwhile, Discover is coming off of a difficult year in which compliance lapses found in an internal review led to the resignation of its CEO. The credit card company reported a 62% drop in Q4 profits and currently carries a market value of around $27.63 billion, with shares down nearly 2% for the year.

An acquisition of this size could signal a better year ahead for a long-anticipated rebound in mergers and acquisitions. Global M&A fell to a 10-year low last year, marking a drag on the investment banks that make money helping boards and investors buy and sell companies.

Capital One's shares closed 0.6% higher at $137.23 apiece on Friday. They are up nearly 5% in 2024, reaching a market cap of $52.2 billion.

February 19, 9:23 p.m. ET: This story has been updated with details of the deal.

