Chapter 25 — Virtual Functions: MINI-LLVM
You're building MINI-LLVM: a tiny instruction hierarchy that mirrors the
real llvm::Instruction class family you'll work with in CS6340 labs. By the
end of the exercise you'll have an abstract base class Inst with three
concrete derived classes (AddInst, LoadInst, BranchInst), a polymorphic
free function describe(), a safe downcast helper asBranch(), and a
countOpcode() function that walks a vector of owned instructions — all
without importing a single LLVM header.
This exercise resolves the Chapter 24 cliffhanger: in Ch 24 you saw that
calling a base-class function through a Base& always invokes the base
version, even when the object is actually a derived type (static binding). The
describe() function here is the exact same pattern — one function that accepts
const Inst& — but now opcodeName() is virtual, so the call dispatches to
whichever concrete class is actually there. That is runtime polymorphism,
and it is the mechanism LLVM relies on for every pass that iterates over
instructions.
The LLVM vocabulary map for this exercise (see also ../../llvm-idioms.md):
| This exercise | Real LLVM API |
|---|---|
Inst | llvm::Instruction |
Inst::opcodeName() | Instruction::getOpcodeName() |
AddInst | llvm::BinaryOperator (Add opcode) |
LoadInst | llvm::LoadInst |
BranchInst | llvm::BranchInst |
asBranch(p) | dyn_cast<BranchInst>(&I) |
asBranch(p) != nullptr | isa<BranchInst>(&I) (the boolean-only cousin) |
countOpcode(module, "add") | a Module → BasicBlock → Instruction walk |
dynamic_cast here is standard C++; LLVM's isa<> / cast<> / dyn_cast<>
are its own faster, RTTI-free helpers (see ../../llvm-idioms.md), but the
reasoning is identical: ask whether the runtime object is a given derived type,
then either branch on a bool (isa) or use the downcast pointer (dyn_cast).
Your tasks
Virtual base destructor (
~Inst()). Define the body ofInst::~Inst(). It may be empty — the important thing is that it exists and is declaredvirtualin the header. Without a virtual destructor, deleting a derived object through anInst*skips the derived destructor entirely.opcodeName()overrides. Implement all threeopcodeName()bodies:AddInstreturns"add",LoadInstreturns"load",BranchInstreturns"br". These three strings make virtual dispatch observable: the same callinst->opcodeName()through anInst*produces a different string depending on which concrete class lives at the other end of the pointer.Derived destructors — the vtable proof. Each body does one thing:
++Inst::s_destroyed. When the grader deletes anAddInstthrough astd::unique_ptr<Inst>, it checks thats_destroyedticked up — direct proof that the virtual destructor chain ran the derived destructor before the base one.describe(const Inst& inst). Return"inst: " + inst.opcodeName(). This is the Chapter 24 cliffhanger resolved: the static type ofinstisconst Inst&, butopcodeName()isvirtual, so the call dispatches at runtime to whichever concrete classinstactually is. You write this function once; it works for any present or futureInstsubclass.asBranch(Inst* inst). Usedynamic_cast<BranchInst*>(inst)and return the result directly. Returns a validBranchInst*when the pointed-at object is aBranchInst; returnsnullptrotherwise (including fornullptrinput). This is LLVM'sdyn_cast<BranchInst>(&I)in miniature. Always use the pointer form — the reference form throwsstd::bad_caston failure instead.countOpcode(...). Iterate over thestd::vector<std::unique_ptr<Inst>>with a range-for. For each element, callopcodeName()on the dereferenced pointer and count how many match the requestedopcodestring. Return the count.
Success criteria
opcodeName()called directly on concrete objects (baseline sanity)opcodeName()called through anInst*(pure virtual dispatch)- Virtual destructor proof:
s_destroyedincrements when a derived object is deleted through aunique_ptr<Inst>base pointer describe()through concrete objects and through base referencesasBranch()succeeding on aBranchInst, failing (nullptr) on others, and handlingnullptrinputcountOpcode()on empty, single, mixed, and all-same vectors- An integrated round-trip: build a 4-instruction basic block, call
describe(),countOpcode(), andasBranch()together
Concepts practiced
virtualfunctions — enable dynamic dispatch through a base pointer or reference (notes 25.2)override— compile-time verification that a derived function matches a base virtual (notes 25.3)final— locksBranchInst::opcodeName()against further overriding (notes 25.3)- Pure virtual / abstract base classes —
= 0onopcodeName()makesInstnon-instantiable (notes 25.7) - Virtual destructors — required whenever a class is deleted through a base pointer; the destroyed-counter proves it fired (notes 25.4)
- Object slicing — the README demonstrates value-copy slicing (with a concrete base, since abstract
Instcannot be stored by value at all) and whystd::vector<std::unique_ptr<Inst>>is the fix for a polymorphic collection (notes 25.9) dynamic_cast(pointer form) — runtime-checked downcast returningnullptron mismatch, the safe pattern used by LLVM's owndyn_cast<>(notes 25.10)- Reused from earlier chapters:
std::vector(Ch 16),std::unique_ptr(Ch 22 — provided scaffolding),std::string_view(Ch 5), header/source split (Ch 2)
Constraints
- Allowed (Ch 25):
virtual,override,final,= 0(pure virtual), virtual destructors,dynamic_cast(pointer form only). - Allowed (earlier chapters):
std::string,std::string_view,std::vector,std::unique_ptr,std::make_unique, range-for,constreferences. - Forbidden:
typeid/type_infoRTTI,reinterpret_cast, rawnew/deletein learner code (use the providedunique_ptrscaffolding),static_castdown the hierarchy (not safe — usedynamic_cast), virtual inheritance (Ch 25.8 — out of scope for this exercise). - Required idioms:
overrideon every derived virtual override; virtual destructor onInst; pointer form ofdynamic_castinasBranch();std::vector<std::unique_ptr<Inst>>for polymorphic ownership.
Build & run locally
make # compile starter/inst.cpp warning-clean (object file only)
make test # grade your code -> RED until the TASK blocks are filled in
make solution # run the grader against the reference solution
make test-solution # proof the exercise is solvable: reference MUST be green
make clean # remove build artifactsThe grader compiles tests/tests.cpp together with your starter/inst.cpp and
runs all checks. Turning 32 red lines green is the exercise.
Hints
Task 1 — why the virtual destructor body can be empty
Inst carries no raw resources (no new-allocated memory, no file handles).
The body of ~Inst() genuinely does not need to do anything. What matters is:
- The declaration in
inst.hsaysvirtual ~Inst();— making it virtual. - There is a definition somewhere (your
inst.cpp) — evenInst::~Inst() = default;(which lets the compiler generate an empty body) is fine.
The derived destructors (~AddInst, etc.) do the interesting work in TASK 3.
Task 2 — what "override" actually checks
override tells the compiler:
This function is intended to override a virtual base function. If it does not match one exactly (return type, name, parameters, const), make it a compile error.
Without override, a typo like std::string opcodeNAME() const silently
creates a brand-new function instead of overriding the base virtual. Because
Inst::opcodeName() here is pure (= 0, no body to fall back on), the
derived class would then leave that pure virtual un-overridden — so it stays
abstract and the test's AddInst add {} fails to compile outright ("variable
type 'AddInst' is an abstract class"). (For a non-pure base virtual, the typo
would instead silently hide the base version and dispatch would keep hitting
the base body.) With override, the compiler catches the mismatch at the
declaration, before any of that. Always use it on every derived override.
(notes 25.3)
Task 3 — the virtual destructor chain in detail
When std::unique_ptr<Inst> goes out of scope holding an AddInst*, this is
what happens:
unique_ptr calls: delete (Inst*)(add_ptr)
|
v
Virtual dispatch: vtable for AddInst -> slot for ~AddInst
|
v
AddInst::~AddInst() runs -> ++Inst::s_destroyed
|
v (automatic base-chain)
Inst::~Inst() runs -> (empty body)If Inst::~Inst() were not virtual, the vtable lookup never happens and the
compiler calls Inst::~Inst() directly — AddInst::~AddInst() is skipped
entirely, s_destroyed stays 0, and any AddInst-specific resources would be
leaked. That is the bug the virtual destructor rule prevents. (notes 25.4)
Task 4 — seeing the dispatch in action
std::string describe(const Inst& inst)
{
return "inst: " + inst.opcodeName();
// ^^^^^^^^^^^^^^^^^
// static type: const Inst&
// dynamic type: AddInst, LoadInst, or BranchInst (at runtime)
//
// Because opcodeName() is virtual, the call goes through the vtable
// of the ACTUAL object, not the static type. This is late binding.
}Chapter 24 had this same pattern with a NON-virtual function: every call
through const Base& hit Base::identify() regardless of the actual object.
Adding virtual is the one-word fix.
Task 5 — dynamic_cast pointer vs reference form
Pointer form (use this):
BranchInst* b = dynamic_cast<BranchInst*>(inst);
// b == nullptr if inst does not point at a BranchInst
// b is valid if inst doesReference form (avoid in asBranch):
BranchInst& b = dynamic_cast<BranchInst&>(*inst);
// throws std::bad_cast if *inst is not a BranchInst
// need a try/catch to handle failure — more code, harder to useLLVM's own dyn_cast<T>(ptr) uses the pointer form and returns nullptr on
failure — exactly the pattern you're implementing here. (notes 25.10,
llvm-idioms.md)
Task 6 — the range-for loop shape
int count { 0 };
for (const auto& ptr : module) // ptr is const unique_ptr<Inst>&
{
// ptr->opcodeName() dereferences unique_ptr and calls virtual opcodeName()
if (ptr->opcodeName() == opcode)
++count;
}
return count;ptr->opcodeName() is the same as (*ptr).opcodeName(). The -> operator on
unique_ptr forwards to the underlying raw pointer. Virtual dispatch still
works — the vtable pointer lives in the concrete object on the heap, not in the
unique_ptr wrapper.
Stuck on a build error instead of a test failure?
- "allocating an object of abstract class type 'Inst'" — you cannot create an
Instobject directly; only derived concrete classes. If a test triesInst x {}, that is the bug — but the provided tests never do this. - "override" compile error — the signature in your derived definition does not
exactly match the base virtual declaration. Check
const, parameter types, and return type carefully. Theoverridekeyword is catching a real mismatch. - "undefined reference to Inst::s_destroyed" — the static member needs a
definition in exactly one
.cpp. It is already provided in the starter; do not remove or duplicate it. - dynamic_cast returns garbage instead of nullptr —
dynamic_castrequires the class to be polymorphic (have at least one virtual function).Instqualifies; just make sure you are using the pointer form, not a C-style cast.
Stretch goals
- Add a
RetInstclass that overridesopcodeName()to return"ret"and marks both the override and the class itself asfinal— experiment with whatfinalallows and forbids. - Add a
virtual void print(std::ostream&) consttoInstand a non-memberoperator<<that calls it — the polymorphic printing pattern from notes 25.11. - Make
describe()return the result of a second virtual method, sayvirtual std::string operands() const, that each derived class implements differently — explore how deep the dispatch chain can go. - Replace
countOpcodewith afindFirst(module, opcode) -> Inst*that returns a pointer to the first matching instruction, ornullptrif none — combine range-forwithdynamic_castto also downcast the result. - Add exception handling (Ch 27 preview): wrap the reference form of
dynamic_castin atry/catch(std::bad_cast&)and compare the error path against the nullable pointer form.
// Chapter 25 — Virtual Functions · MINI-LLVM: inst.cpp (STARTER)
// ─────────────────────────────────────────────────────────────────────────────
// Fill in the six TASK blocks below. Each maps 1:1 to a task in the README
// and to a declaration in ../inst.h. The bodies currently return WRONG
// PLACEHOLDERS so the file compiles immediately — that is why `make test` is
// RED right now. Your job is to turn it GREEN.
//
// make build compile your code (should already work)
// make test grade it (RED until you fill these in)
// make solution run the grader against the reference if stuck
//
// KEY VOCABULARY (notes 25.2, 25.3, 25.7):
// virtual — enables dynamic dispatch through a base pointer/reference
// override — verifies the function actually overrides a base virtual
// final — prevents further overriding (on a function) or inheritance
// = 0 — pure virtual; makes the class ABSTRACT (non-instantiable)
// virtual ~ — virtual destructor; required when deleting through base ptr
// ─────────────────────────────────────────────────────────────────────────────
#include "../inst.h"
#include <string>
// ── Provided: definition of the static counter (do not change this) ──────────
// This counter lets the grader verify that deleting through a base pointer
// really calls the DERIVED destructor. Each derived ~Dtor should ++s_destroyed.
int Inst::s_destroyed = 0;
// ─────────────────────────────────────────────────────────────────────────────
// TASK 1: Virtual base-class destructor
// ─────────────────────────────────────────────────────────────────────────────
// Define the body of Inst::~Inst(). It does not need to increment s_destroyed
// (the DERIVED destructors do that in TASK 3). The body may be empty; what
// matters is that it EXISTS and is VIRTUAL (already declared virtual in inst.h).
//
// WHY: If ~Inst() were not virtual, 'delete base_ptr' would skip the derived
// destructor, leaving derived-only resources leaked and s_destroyed un-bumped.
// The virtual destructor is the FIRST thing to add to any polymorphic base.
// (notes 25.4)
//
// >>> YOUR CODE HERE <<<
//
Inst::~Inst()
{
// placeholder — body intentionally empty (correct behavior: do nothing extra)
}
// ─────────────────────────────────────────────────────────────────────────────
// ─────────────────────────────────────────────────────────────────────────────
// TASK 2: Override opcodeName() in each derived class
// ─────────────────────────────────────────────────────────────────────────────
// Each derived class must return its own opcode string so that virtual dispatch
// through an Inst& or Inst* picks the RIGHT concrete name at runtime.
//
// AddInst::opcodeName() must return "add" (notes 25.2 — opcode dispatch)
// LoadInst::opcodeName() must return "load"
// BranchInst::opcodeName() must return "br"
//
// The 'override' specifier (already on the declarations in inst.h) makes a
// signature mismatch a compile error — your best friend against typos. (25.3)
//
// >>> YOUR CODE HERE <<<
//
std::string AddInst::opcodeName() const
{
return "???"; // placeholder — wrong; replace with the real opcode string
}
std::string LoadInst::opcodeName() const
{
return "???"; // placeholder
}
std::string BranchInst::opcodeName() const
{
return "???"; // placeholder
}
// ─────────────────────────────────────────────────────────────────────────────
// ─────────────────────────────────────────────────────────────────────────────
// TASK 3: Derived destructors — count to prove the virtual dtor fires
// ─────────────────────────────────────────────────────────────────────────────
// Define ~AddInst(), ~LoadInst(), and ~BranchInst(). Each body must execute:
//
// ++Inst::s_destroyed;
//
// That is the ONLY thing needed in each body. When the grader deletes an
// AddInst through a std::unique_ptr<Inst> (a base pointer), it checks that
// s_destroyed ticked upward — proof that the virtual destructor chain ran
// AddInst::~AddInst() before Inst::~Inst(). Without 'virtual ~Inst()', only
// Inst::~Inst() would run and the counter would stay at zero. (notes 25.4)
//
// >>> YOUR CODE HERE <<<
//
AddInst::~AddInst()
{
// placeholder — does NOT increment s_destroyed (tests will fail)
}
LoadInst::~LoadInst()
{
// placeholder — does NOT increment s_destroyed (tests will fail)
}
BranchInst::~BranchInst()
{
// placeholder — does NOT increment s_destroyed (tests will fail)
}
// ─────────────────────────────────────────────────────────────────────────────
// ─────────────────────────────────────────────────────────────────────────────
// TASK 4: describe() — the Ch24 cliffhanger RESOLVED
// ─────────────────────────────────────────────────────────────────────────────
// Return a string of the form "inst: <opcodeName>", e.g. "inst: add".
//
// The call inst.opcodeName() inside this function uses VIRTUAL LATE BINDING:
// even though the parameter type is 'const Inst&' (a base reference), the call
// dispatches to the most-derived override at runtime. This is the solution to
// the Chapter 24 problem: in Ch24 all calls through a Base& hit the base
// version; adding 'virtual' sends each call to the right derived version.
// (notes 25.2)
//
// Hint: "inst: " + inst.opcodeName() gives you the right string.
//
// >>> YOUR CODE HERE <<<
//
std::string describe(const Inst& /*inst*/)
{
return "inst: ???"; // placeholder — real result uses virtual dispatch
}
// ─────────────────────────────────────────────────────────────────────────────
// ─────────────────────────────────────────────────────────────────────────────
// TASK 5: asBranch() — safe dynamic downcast (LLVM's dyn_cast<> in miniature)
// ─────────────────────────────────────────────────────────────────────────────
// Use dynamic_cast<BranchInst*>(inst) to attempt a downcast.
// • If 'inst' actually points at a BranchInst, the cast succeeds and you get
// a valid BranchInst*.
// • If 'inst' points at something else (AddInst, LoadInst, nullptr), the cast
// returns nullptr — safe, no exception.
//
// IMPORTANT: always use the POINTER form of dynamic_cast here (not the
// reference form); the reference form throws std::bad_cast on failure, which is
// harder to handle and NOT what LLVM's dyn_cast does. (notes 25.10)
//
// In real LLVM code you would write:
// if (auto *BI = dyn_cast<BranchInst>(&I)) { ... }
// This function is that pattern packaged as a helper. (llvm-idioms.md)
//
// >>> YOUR CODE HERE <<<
//
BranchInst* asBranch(Inst* /*inst*/)
{
return nullptr; // placeholder — always returns null (even for real BranchInsts)
}
// ─────────────────────────────────────────────────────────────────────────────
// ─────────────────────────────────────────────────────────────────────────────
// TASK 6: countOpcode() — walk a polymorphic instruction list
// ─────────────────────────────────────────────────────────────────────────────
// Iterate over every element of 'module' (a vector of unique_ptr<Inst>).
// For each element, dereference the unique_ptr to get an Inst&, then call
// opcodeName() on it. Compare the result to 'opcode' (std::string_view
// compares naturally with ==). Count and return how many match.
//
// Key insight — WHY unique_ptr<Inst> instead of a plain Inst value?
// If you stored Inst objects BY VALUE (std::vector<Inst>) and pushed an
// AddInst, the derived part would be SLICED OFF, turning it into a plain Inst
// with no override. Storing unique_ptr<Inst> keeps the full derived object
// alive; the pointer knows nothing about the concrete type but the vtable
// still works. This is the polymorphic-ownership idiom from notes 25.9.
//
// In real LLVM, this looks like:
// for (auto &I : BB) { if (I.getOpcodeName() == "add") ++count; }
// (llvm-idioms.md "Walk every Instruction in a BasicBlock")
//
// >>> YOUR CODE HERE <<<
//
int countOpcode(const std::vector<std::unique_ptr<Inst>>& /*module*/,
std::string_view /*opcode*/)
{
return 0; // placeholder — always returns 0 (tests will fail for non-zero counts)
}
// ─────────────────────────────────────────────────────────────────────────────
Try the lab first — the learning is in the attempt.
// Chapter 25 — Virtual Functions · MINI-LLVM: inst.cpp (REFERENCE SOLUTION)
// ─────────────────────────────────────────────────────────────────────────────
// Complete, correct, warning-clean implementation of ../inst.h.
// Peek only after taking a real swing at starter/inst.cpp — the learning is in
// writing the virtual dispatch yourself, then comparing.
//
// Design summary: every task here exercises ONE new concept from Ch 25:
// Task 1 — virtual destructor (notes 25.4)
// Task 2 — virtual/pure-virtual override (notes 25.2, 25.7)
// Task 3 — derived destructor counter (proves vtable-through-base-ptr)
// Task 4 — free-function polymorphic dispatch (notes 25.2) ← Ch24 cliff resolved
// Task 5 — dynamic_cast pointer form (notes 25.10)
// Task 6 — polymorphic vector walk; unique_ptr avoids slicing (notes 25.9)
// ─────────────────────────────────────────────────────────────────────────────
#include "../inst.h"
#include <string>
// ── Provided: static counter definition ─────────────────────────────────────
// Every translation unit that includes inst.h sees the DECLARATION of
// s_destroyed; this is the one-and-only DEFINITION (Ch 7 linkage rule).
int Inst::s_destroyed = 0;
// ─── TASK 1: Virtual base-class destructor ───────────────────────────────────
// The body is intentionally empty — Inst carries no raw resources. What
// matters is that it EXISTS and is VIRTUAL (already declared 'virtual' in the
// header).
//
// MECHANISM: When you write 'delete base_ptr' (or a unique_ptr<Inst> goes
// out of scope), the compiler does NOT call Inst::~Inst() directly. Instead,
// it follows the vtable pointer stored in the object, which leads to the
// DERIVED destructor (AddInst::~AddInst, etc.). That derived destructor runs,
// then chains up to Inst::~Inst() automatically — exactly as intended.
//
// Without 'virtual' on ~Inst(), the vtable lookup never happens: the compiler
// calls Inst::~Inst() directly, skipping the derived body entirely. That is
// the root cause of the classic "virtual destructor" bug. (notes 25.4)
Inst::~Inst() = default; // empty body; let the compiler generate it cleanly
// ─── TASK 2: opcodeName() overrides ─────────────────────────────────────────
// Each concrete class returns its own opcode string. The 'override' specifier
// in the header (and here, via the out-of-class definition) silently verifies
// the signature matches the base virtual — any typo (missing const, different
// return type) becomes a compile error rather than a silent shadow function.
// (notes 25.3)
std::string AddInst::opcodeName() const
{
return "add"; // real LLVM: Instruction::getOpcodeName() on a BinaryOperator
}
std::string LoadInst::opcodeName() const
{
return "load"; // real LLVM: llvm::LoadInst → getOpcodeName() = "load"
}
// BranchInst::opcodeName is marked 'final' in the header: no class that
// inherits from BranchInst can override opcodeName again — the hierarchy stops
// here. Use 'final' as a design signal: "this version is definitive." (25.3)
std::string BranchInst::opcodeName() const
{
return "br"; // real LLVM: llvm::BranchInst → getOpcodeName() = "br"
}
// ─── TASK 3: Derived destructors — the vtable-through-base-ptr proof ─────────
// Each body does ONE thing: ++Inst::s_destroyed.
//
// When the grader holds a std::unique_ptr<Inst> pointing at (say) an AddInst
// and lets it go out of scope, the sequence is:
//
// 1. unique_ptr calls delete ptr through an Inst*.
// 2. Virtual dispatch: vtable points at AddInst::~AddInst().
// 3. AddInst::~AddInst() runs -> s_destroyed++.
// 4. Base destructor chain: Inst::~Inst() runs automatically.
//
// If ~Inst() were NOT virtual, step 2 would be skipped and only Inst::~Inst()
// would run — s_destroyed would stay 0 and the grader would fail TASK 3.
// That is exactly why the check is here: to OBSERVE the virtual dtor, not just
// declare it and hope. (notes 25.4)
AddInst::~AddInst()
{
++Inst::s_destroyed;
}
LoadInst::~LoadInst()
{
++Inst::s_destroyed;
}
BranchInst::~BranchInst()
{
++Inst::s_destroyed;
}
// ─── TASK 4: describe() — runtime polymorphism through a base reference ───────
// This is the Chapter 24 cliffhanger RESOLVED.
//
// In Ch 24, functions that accept a Base& and call a non-virtual method always
// invoke Base's version — no matter what the actual object is. The programmer
// expected polymorphic behavior but got static binding.
//
// Adding 'virtual' to opcodeName() fixes this: the call inst.opcodeName()
// below dispatches through the vtable to whichever concrete class 'inst' is.
// describe() is written ONCE and works for AddInst, LoadInst, BranchInst, and
// any future derived class. This is RUNTIME POLYMORPHISM. (notes 25.2)
std::string describe(const Inst& inst)
{
// inst.opcodeName() — virtual late binding.
// Even though 'inst' has static type 'const Inst&', the vtable pointer
// stored in the actual object routes this call to the derived override.
return "inst: " + inst.opcodeName();
}
// ─── TASK 5: asBranch() — LLVM-style safe downcast ──────────────────────────
// dynamic_cast<T*>(ptr) performs a RUNTIME type check:
// • If the object is-a BranchInst (or a subtype of it) -> return T* (valid).
// • Otherwise -> return nullptr.
//
// Always use the POINTER form when you want nullptr on failure; the REFERENCE
// form throws std::bad_cast instead — appropriate if failure is truly
// unexpected, but harder to handle in a general helper. (notes 25.10)
//
// In real LLVM you write:
// if (auto *BI = dyn_cast<BranchInst>(&I)) { use BI; }
// which is exactly this function's pattern. (llvm-idioms.md)
//
// Note: dynamic_cast requires a POLYMORPHIC type (one with at least one
// virtual function) — Inst qualifies because of opcodeName() and ~Inst().
BranchInst* asBranch(Inst* inst)
{
return dynamic_cast<BranchInst*>(inst); // returns nullptr on mismatch
}
// ─── TASK 6: countOpcode() — walk a polymorphic instruction vector ───────────
// Iterate over the vector of unique_ptr<Inst>. Dereference each smart pointer
// to get a const Inst& and call opcodeName() — virtual dispatch picks the right
// concrete opcode string. Count how many match the requested opcode.
//
// WHY unique_ptr<Inst> instead of std::vector<Inst>?
// Storing by value would SLICE: when you push_back(AddInst{}) into a
// vector<Inst>, only the Inst sub-object is copied; the AddInst-specific
// vtable pointer and any AddInst data are left behind. The object in the
// vector becomes a plain Inst with no opcodeName override. Using smart
// pointers (owning pointers) avoids slicing: the full derived object lives on
// the heap, and the pointer just references it. (notes 25.9)
//
// In real LLVM: for (Instruction &I : BB) { if (I.getOpcodeName() == opcode) ++n; }
// (llvm-idioms.md "Walk every Instruction in a BasicBlock")
int countOpcode(const std::vector<std::unique_ptr<Inst>>& module,
std::string_view opcode)
{
int count { 0 };
for (const auto& ptr : module) // ptr is const unique_ptr<Inst>&
{
// *ptr dereferences to const Inst& — virtual dispatch on opcodeName()
if (ptr->opcodeName() == opcode)
++count;
}
return count;
}
// Chapter 25 — Virtual Functions · MINI-LLVM (GRADER)
// ─────────────────────────────────────────────────────────────────────────────
// Tiny no-framework unit-test harness (same style as drills/CLAUDE.md).
// Includes ../inst.h and calls the API across many deterministic inputs.
// Each failing CHECK prints its expression and line number.
// Any failure -> non-zero exit -> `make test` is RED.
//
// The Makefile links this file against starter/inst.cpp (`make test`) or
// solution/inst.cpp (`make test-solution`).
// ─────────────────────────────────────────────────────────────────────────────
#include <iostream>
#include <memory>
#include <string>
#include "../inst.h"
static int fails = 0;
// CHECK: assert a boolean condition; on failure, report what and where.
#define CHECK(cond) \
do { if(!(cond)){ std::cerr << "FAIL: " #cond " @line " << __LINE__ << "\n"; ++fails; } } while(0)
int main()
{
// ─────────────────────────────────────────────────────────────────────────
// TASK 2: opcodeName() — virtual dispatch returns the correct string
// ─────────────────────────────────────────────────────────────────────────
// First, call directly on concrete objects (static binding — baseline).
{
AddInst add {};
LoadInst load {};
BranchInst br {};
CHECK(add.opcodeName() == "add");
CHECK(load.opcodeName() == "load");
CHECK(br.opcodeName() == "br");
}
// Now call through a BASE POINTER — the whole point of virtual dispatch.
// Even though the static type of 'p' is Inst*, the vtable must route each
// call to the most-derived override. (notes 25.2)
{
Inst* p1 = new AddInst {};
Inst* p2 = new LoadInst {};
Inst* p3 = new BranchInst {};
CHECK(p1->opcodeName() == "add"); // Inst* -> AddInst override
CHECK(p2->opcodeName() == "load"); // Inst* -> LoadInst override
CHECK(p3->opcodeName() == "br"); // Inst* -> BranchInst override
// Clean up via base pointer — requires virtual destructor (TASK 1/3).
// If ~Inst() is not virtual these deletes only call Inst::~Inst() and
// s_destroyed stays 0.
delete p1;
delete p2;
delete p3;
}
// ─────────────────────────────────────────────────────────────────────────
// TASK 1 + 3: Virtual destructor proof
// Delete through a BASE POINTER and verify the DERIVED dtor ran.
//
// We use unique_ptr<Inst> here (Ch 22 smart pointer — in scope, provided as
// scaffolding) which calls delete base_ptr exactly as raw new/delete would,
// but safely. unique_ptr is NOT ahead of scope: Ch 22 <= Ch 25.
// ─────────────────────────────────────────────────────────────────────────
Inst::s_destroyed = 0; // reset before the proof sequence
{
// NOTE the STATIC type: std::unique_ptr<Inst>, not <AddInst>. The held
// object is an AddInst, but the unique_ptr's deleter calls
// delete (Inst*)raw
// i.e. delete through a BASE pointer. Only a VIRTUAL ~Inst() routes that
// to ~AddInst(). (If we wrote unique_ptr<AddInst>, the static type would
// already be AddInst and the test would pass even with a non-virtual base
// dtor — proving nothing. The base-pointer type is what makes this a real
// proof.)
std::unique_ptr<Inst> p = std::make_unique<AddInst>();
CHECK(Inst::s_destroyed == 0); // dtor hasn't run yet
} // unique_ptr<Inst> out of scope -> delete Inst* -> vtable -> ~AddInst() -> ++s_destroyed
CHECK(Inst::s_destroyed == 1); // derived dtor MUST have fired through the base ptr
{
std::unique_ptr<Inst> p = std::make_unique<LoadInst>();
}
CHECK(Inst::s_destroyed == 2); // second derived dtor, also via unique_ptr<Inst>
{
std::unique_ptr<Inst> p = std::make_unique<BranchInst>();
}
CHECK(Inst::s_destroyed == 3); // third derived dtor, also via unique_ptr<Inst>
// Edge: delete nullptr through a base pointer — should be a no-op.
{
Inst* null_ptr = nullptr;
delete null_ptr; // well-defined; should NOT increment s_destroyed
}
CHECK(Inst::s_destroyed == 3); // unchanged after deleting nullptr
// ─────────────────────────────────────────────────────────────────────────
// TASK 4: describe() — polymorphic free function (Ch24 cliff resolved)
// ─────────────────────────────────────────────────────────────────────────
{
AddInst add {};
LoadInst load {};
BranchInst br {};
// Call through concrete objects directly.
CHECK(describe(add) == "inst: add");
CHECK(describe(load) == "inst: load");
CHECK(describe(br) == "inst: br");
// Call through a BASE REFERENCE — this is where virtual matters.
// Static type: const Inst&; dynamic type: AddInst/LoadInst/BranchInst.
Inst& ref_add = add;
Inst& ref_load = load;
Inst& ref_br = br;
CHECK(describe(ref_add) == "inst: add");
CHECK(describe(ref_load) == "inst: load");
CHECK(describe(ref_br) == "inst: br");
}
// ─────────────────────────────────────────────────────────────────────────
// TASK 5: asBranch() — dynamic_cast pointer form
// ─────────────────────────────────────────────────────────────────────────
{
AddInst add {};
LoadInst load {};
BranchInst br {};
// Non-branch instructions must yield nullptr — no crash, just null.
CHECK(asBranch(&add) == nullptr);
CHECK(asBranch(&load) == nullptr);
// A BranchInst must succeed and return a valid pointer.
BranchInst* result = asBranch(&br);
CHECK(result != nullptr);
// Verify we got the right object back (pointer identity).
CHECK(result == &br);
// Exercise BranchInst-specific API through the downcast pointer.
CHECK(result->isUnconditional() == true);
// Edge: nullptr input must not crash; return nullptr.
CHECK(asBranch(nullptr) == nullptr);
// Through a base pointer (the typical real-world pattern).
Inst* base_ptr_br = &br;
Inst* base_ptr_add = &add;
CHECK(asBranch(base_ptr_br) != nullptr); // is a BranchInst
CHECK(asBranch(base_ptr_add) == nullptr); // is NOT a BranchInst
}
// ─────────────────────────────────────────────────────────────────────────
// TASK 6: countOpcode() — walk a polymorphic instruction vector
// ─────────────────────────────────────────────────────────────────────────
{
std::vector<std::unique_ptr<Inst>> module {};
module.push_back(std::make_unique<AddInst>());
module.push_back(std::make_unique<AddInst>());
module.push_back(std::make_unique<LoadInst>());
module.push_back(std::make_unique<BranchInst>());
CHECK(countOpcode(module, "add") == 2);
CHECK(countOpcode(module, "load") == 1);
CHECK(countOpcode(module, "br") == 1);
CHECK(countOpcode(module, "mul") == 0); // not present -> 0
// Edge: empty module.
std::vector<std::unique_ptr<Inst>> empty {};
CHECK(countOpcode(empty, "add") == 0);
// Edge: all same opcode.
std::vector<std::unique_ptr<Inst>> all_adds {};
all_adds.push_back(std::make_unique<AddInst>());
all_adds.push_back(std::make_unique<AddInst>());
all_adds.push_back(std::make_unique<AddInst>());
CHECK(countOpcode(all_adds, "add") == 3);
CHECK(countOpcode(all_adds, "load") == 0);
// Edge: single instruction.
std::vector<std::unique_ptr<Inst>> singleton {};
singleton.push_back(std::make_unique<BranchInst>());
CHECK(countOpcode(singleton, "br") == 1);
CHECK(countOpcode(singleton, "add") == 0);
}
// ─────────────────────────────────────────────────────────────────────────
// OBJECT SLICING DEMO — no graded task; demonstrates WHY slicing would
// break everything and why the vector stores pointers, not values.
//
// We cannot put an abstract-base Inst by value in a vector<Inst> because
// Inst has pure virtual functions — the compiler would reject it outright.
// The section in the README shows slicing with a CONCRETE pair (notes 25.9)
// so the learner can SEE the behavior without UB.
// ─────────────────────────────────────────────────────────────────────────
// ─────────────────────────────────────────────────────────────────────────
// Integrated dispatch round-trip: combine describe + countOpcode
// ─────────────────────────────────────────────────────────────────────────
{
// Build a mini "basic block": two adds, one load, one branch.
std::vector<std::unique_ptr<Inst>> bb {};
bb.push_back(std::make_unique<AddInst>());
bb.push_back(std::make_unique<LoadInst>());
bb.push_back(std::make_unique<AddInst>());
bb.push_back(std::make_unique<BranchInst>());
// describe() must work through the unique_ptr's operator->
CHECK(describe(*bb[0]) == "inst: add");
CHECK(describe(*bb[1]) == "inst: load");
CHECK(describe(*bb[2]) == "inst: add");
CHECK(describe(*bb[3]) == "inst: br");
// countOpcode round-trip.
CHECK(countOpcode(bb, "add") == 2);
CHECK(countOpcode(bb, "load") == 1);
CHECK(countOpcode(bb, "br") == 1);
// asBranch must succeed for the branch and fail for others.
CHECK(asBranch(bb[0].get()) == nullptr); // AddInst
CHECK(asBranch(bb[3].get()) != nullptr); // BranchInst
}
// ─────────────────────────────────────────────────────────────────────────
if (!fails)
std::cout << "PASS \xE2\x9C\x85 all MINI-LLVM checks passed.\n";
else
std::cerr << "\nFAIL \xE2\x9D\x8C " << fails
<< " check(s) failed \xe2\x80\x94 fix the TASK blocks in starter/inst.cpp.\n";
return fails ? 1 : 0;
}
// Chapter 25 — Virtual Functions · MINI-LLVM: inst.h
// ─────────────────────────────────────────────────────────────────────────────
// PUBLIC INTERFACE — declarations only. The learner implements these in
// starter/inst.cpp; tests and the demo include this header and call through it.
//
// DESIGN NOTE: This file intentionally mirrors LLVM's own Instruction
// hierarchy (llvm::Instruction -> llvm::BinaryOperator, llvm::LoadInst,
// llvm::BranchInst …). The names you learn here map directly to the real API
// you will read in CS6340 labs:
//
// This exercise Real LLVM
// ────────────────── ─────────────────────────────────────────────
// Inst llvm::Instruction
// Inst::opcodeName() Instruction::getOpcodeName()
// AddInst llvm::BinaryOperator (with Add opcode)
// LoadInst llvm::LoadInst
// BranchInst llvm::BranchInst
// asBranch(p) llvm::dyn_cast<BranchInst>(&I) [llvm-idioms.md]
// countOpcode(…) a walk of Module → Function → BB → Instruction
//
// ─────────────────────────────────────────────────────────────────────────────
#ifndef INST_H
#define INST_H
#include <memory> // std::unique_ptr (Ch 22 — provided scaffolding)
#include <string> // std::string
#include <string_view> // std::string_view
#include <vector> // std::vector (Ch 16)
// ─────────────────────────────────────────────────────────────────────────────
// Inst — abstract base class for every MINI-LLVM instruction.
//
// "Abstract" means it has at least one pure virtual function (= 0), so you
// cannot instantiate Inst directly — you can only hold Inst* or Inst& pointing
// at a CONCRETE derived object. (notes 25.7)
//
// The VIRTUAL DESTRUCTOR is mandatory for any class deleted through a base
// pointer. Without it, 'delete base_ptr' would call only ~Inst(), leaking the
// derived object's resources. (notes 25.4)
// ─────────────────────────────────────────────────────────────────────────────
class Inst
{
public:
// ── Virtual destructor (TASK 1) ──────────────────────────────────────────
// Must be virtual so that 'delete inst_ptr' calls the derived destructor.
// Declare it here; define the body in inst.cpp. Track destruction with the
// static counter s_destroyed so tests can PROVE it fired through a base ptr.
virtual ~Inst();
// ── Pure virtual function (TASK 2) ───────────────────────────────────────
// Every concrete Inst MUST supply its opcode name.
// "Pure virtual" (= 0) makes Inst an ABSTRACT BASE CLASS. (notes 25.7)
virtual std::string opcodeName() const = 0;
// ── describe() free function (declared below) will call this through Inst& ─
// ── Static destroyed counter — lets tests verify the virtual dtor ran ────
// NOT a TASK block; this is provided scaffolding for the test harness.
static int s_destroyed; // definition lives in inst.cpp
};
// ─────────────────────────────────────────────────────────────────────────────
// AddInst — represents an integer addition instruction.
// In real LLVM: llvm::BinaryOperator with opcode Instruction::Add
// ─────────────────────────────────────────────────────────────────────────────
class AddInst : public Inst
{
public:
// ── TASK 2 (continued) ───────────────────────────────────────────────────
// Override opcodeName() to return "add".
// The 'override' specifier tells the compiler to verify this matches the
// base virtual signature — a typo (wrong const, wrong return type) becomes
// a compile error, not a silent new function. (notes 25.3)
std::string opcodeName() const override;
// ── TASK 3 ───────────────────────────────────────────────────────────────
// Override ~AddInst() — it must increment Inst::s_destroyed so the grader
// can confirm the derived dtor ran when deleted through a base pointer.
~AddInst() override;
};
// ─────────────────────────────────────────────────────────────────────────────
// LoadInst — represents a memory-load instruction.
// In real LLVM: llvm::LoadInst
// ─────────────────────────────────────────────────────────────────────────────
class LoadInst : public Inst
{
public:
// ── TASK 2 (continued) ───────────────────────────────────────────────────
std::string opcodeName() const override;
// ── TASK 3 ───────────────────────────────────────────────────────────────
~LoadInst() override;
};
// ─────────────────────────────────────────────────────────────────────────────
// BranchInst — represents a branch instruction; always the last in a block.
// In real LLVM: llvm::BranchInst
//
// 'final' on the override means NO class may further override opcodeName()
// by inheriting from BranchInst. Use it when the hierarchy should stop here.
// (notes 25.3)
// ─────────────────────────────────────────────────────────────────────────────
class BranchInst : public Inst
{
public:
// ── TASK 2 (continued) ───────────────────────────────────────────────────
// 'final' locks this override — no subclass of BranchInst can change it.
std::string opcodeName() const override final;
// ── TASK 3 ───────────────────────────────────────────────────────────────
~BranchInst() override;
// BranchInst-specific API (downcast target in TASK 5):
bool isUnconditional() const { return true; } // simplified — always true here
};
// ─────────────────────────────────────────────────────────────────────────────
// describe() — free function that accepts any Inst by CONST REFERENCE and
// returns a human-readable string like "add: <opcode>".
//
// The call 'inst.opcodeName()' inside describe() dispatches POLYMORPHICALLY
// (virtual late binding through the reference) — this is the Ch24 cliffhanger
// RESOLVED. In Ch24, non-virtual functions on a Base& would always call the
// base version; with 'virtual', the actual derived override runs. (notes 25.2)
// ─────────────────────────────────────────────────────────────────────────────
// ── TASK 4 (declared here, implemented in inst.cpp) ──────────────────────────
std::string describe(const Inst& inst);
// ─────────────────────────────────────────────────────────────────────────────
// asBranch() — safe downcast from Inst* to BranchInst*.
//
// Returns a valid BranchInst* when the pointed-at object IS a BranchInst;
// returns nullptr otherwise. This is the C++ equivalent of LLVM's own
// dyn_cast<BranchInst>(&inst) — see llvm-idioms.md "dyn_cast<>".
//
// IMPORTANT: we use the POINTER form of dynamic_cast so failure gives nullptr,
// not an exception. (The reference form throws std::bad_cast — notes 25.10.)
// ─────────────────────────────────────────────────────────────────────────────
// ── TASK 5 (declared here, implemented in inst.cpp) ──────────────────────────
BranchInst* asBranch(Inst* inst);
// ─────────────────────────────────────────────────────────────────────────────
// countOpcode() — walk a "module" (a vector of owned instructions) and count
// how many have a given opcode name.
//
// This mirrors Lab-0's Module → BB → Instruction walk without needing LLVM.
// The vector stores std::unique_ptr<Inst> (Ch 22) to OWN the objects and avoid
// slicing: std::vector<Inst> would SLICE derived objects on insertion, erasing
// the dynamic type entirely. (notes 25.9)
// ─────────────────────────────────────────────────────────────────────────────
// ── TASK 6 (declared here, implemented in inst.cpp) ──────────────────────────
int countOpcode(const std::vector<std::unique_ptr<Inst>>& module,
std::string_view opcode);
#endif // INST_H
# Chapter 25 — Virtual Functions · MINI-LLVM · unit-test grader (Style B).
# Targets follow the drills/CLAUDE.md Makefile contract. TABS, not spaces.
#
# Layout:
# inst.h — declarations only (provided, complete)
# starter/inst.cpp — bodies with TASK blocks (learner fills in)
# solution/inst.cpp — reference bodies
# tests/tests.cpp — includes ../inst.h, calls the API, CHECKs results
#
# -I. puts the chapter root on the include path so every file can write:
# #include "inst.h"
# (mirrors how LLVM passes find their own headers via -I<pass_root>)
CXX := clang++
CXXFLAGS := -std=c++17 -Wall -Wextra -I.
.PHONY: all build run test solution test-solution clean
all: build
# ── Starter: compile-check the learner's inst.cpp (object only, no link) ─────
# Building an object file verifies the code is warning-clean without needing a
# main(). The test target does the full grader build + run.
build:
$(CXX) $(CXXFLAGS) -c starter/inst.cpp -o starter/inst.o
@echo "OK \xE2\x9C\x85 starter/inst.cpp compiles. Now run: make test"
# run — show the learner the solution in action for a quick sanity check.
run: build
@echo "(no standalone driver; run 'make test' to grade or 'make solution' to see the reference)"
# ── The grader: link the tests against the LEARNER's starter/inst.cpp ─────────
# RED until the TASK blocks are filled in; GREEN once they are.
test:
$(CXX) $(CXXFLAGS) tests/tests.cpp starter/inst.cpp -o tests/run
@./tests/run || echo "FAIL \xE2\x9D\x8C fill in the TASK blocks in starter/inst.cpp until every check passes."
# ── The reference solution: build + run as a proof ───────────────────────────
solution:
$(CXX) $(CXXFLAGS) tests/tests.cpp solution/inst.cpp -o solution/run
./solution/run
# ── Proof the exercise is solvable: the reference MUST pass every check ───────
test-solution:
$(CXX) $(CXXFLAGS) tests/tests.cpp solution/inst.cpp -o tests/run
@./tests/run
clean:
rm -f starter/inst.o tests/run solution/run
rm -rf tests/run.dSYM solution/run.dSYM
make test locally
(see “Build & run locally” above).