ROSE 編譯器框架/通用資料流框架
隨著 ROSE 專案的進行,我們收集了相當多的資料流分析版本。維護和使用它們很痛苦,因為它們
- 重複迭代不動點演算法,
- 分散在不同的目錄中,
- 使用不同的結果表示形式,以及
- 具有不同的成熟度和魯棒性。
一項正在進行的工作是將所有資料流分析工作整合到一個單一的框架中。
快速事實
- 原始作者:Greg Bronevetsky
- 程式碼管理員:廖春華
- 文件
- 原始碼:./src/midend/programAnalysis/genericDataflow 下的檔案
- 測試:tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests
列表
- 常量傳播
- 支配者分析:dominatorAnalysis.h dominatorAnalysis.C
- 活躍/死變數分析,或活性分析:liveDeadVarAnalysis.h liveDeadVarAnalysis.C
- 指標分析
函式和節點狀態是執行資料流分析的兩個必需引數
它們一起儲存在函式狀態內部 //functionState.h
functionState.h
genericDataflow/cfgUtils/CallGraphTraverse.h
函式的抽象,在內部連線到 SgFunctionDeclaration *decl
宣告在 ./src/midend/programAnalysis/genericDataflow/cfgUtils/CallGraphTraverse.h 中
建構函式
- Function::Function(string name) 基於函式名
- Function::Function(SgFunctionDeclaration* sample) // 核心建構函式
- Function::Function(SgFunctionDefinition* sample)
CGFunction* cgFunc; // 呼叫圖函式
Function func(cgFunc);
與 CFG 節點相關的任何資訊。
- 它沒有資料流的輸入/輸出概念
- 在資料流分析期間不打算演變
class NodeFact: public printable
{
public:
// returns a copy of this node fact
virtual NodeFact* copy() const=0;
};
儲存有關多個分析及其對應格的資訊,用於給定節點(CFG 節點??)
./src/midend/programAnalysis/genericDataflow/state/nodeState.h
它還提供靜態函式來
- 為所有 DataflowNode 初始化節點狀態
- 為給定的 DataflowNode 檢索節點狀態
class NodeState
{
// internal types: map between analysis and set of lattices
typedef std::map<Analysis*, std::vector<Lattice*> > LatticeMap;
typedef std::map<Analysis*, std::vector<NodeFact*> > NodeFactMap;
typedef std::map<Analysis*, bool > BoolMap;
// the dataflow information Above the node, for each analysis that
// may be interested in the current node
LatticeMap dfInfoAbove; // IN set in a dataflow
// the Analysis information Below the node, for each analysis that
// may be interested in the current node
LatticeMap dfInfoBelow; // OUT set in a dataflow,
// the facts that are true at this node, for each analysis that
// may be interested in the current node
NodeFactMap facts;
// Contains all the Analyses that have initialized their state at this node. It is a map because
// TBB doesn't provide a concurrent set.
BoolMap initializedAnalyses;
// static interfaces
// returns the NodeState object associated with the given dataflow node.
// index is used when multiple NodeState objects are associated with a given node
// (ex: SgFunctionCallExp has 3 NodeStates: entry, function body, exit)
static NodeState* getNodeState(const DataflowNode& n, int index=0);
// most useful interface: retrieve the lattices (could be only one) associated with a given analysis
// returns the map containing all the lattices from above the node that are owned by the given analysis
// (read-only access)
const std::vector<Lattice*>& getLatticeAbove(const Analysis* analysis) const;
// returns the map containing all the lattices from below the node that are owned by the given analysis
// (read-only access)
const std::vector<Lattice*>& getLatticeBelow(const Analysis* analysis) const;
}
./src/midend/programAnalysis/genericDataflow/state/functionState.h
函式和節點狀態的一對。
- 它提供靜態函式來初始化所有函式狀態並檢索函式狀態
class FunctionState
{
friend class CollectFunctions;
public:
Function func;
NodeState state;
// The lattices that describe the value of the function's return variables
NodeState retState;
private:
static std::set<FunctionState*> allDefinedFuncs;
static std::set<FunctionState*> allFuncs;
static bool allFuncsComputed;
public:
FunctionState(Function &func):
func(func),
state(/*func.get_declaration()->cfgForBeginning()*/)
{}
// We should use this interface --------------
// 1. returns a set of all the functions whose bodies are in the project
static std::set<FunctionState*>& getAllDefinedFuncs();
// 2. returns the FunctionState associated with the given function
// func may be any declared function
static FunctionState* getFuncState(const Function& func);
...
}
FunctionState* fs = new FunctionState(func); // 從函式狀態到節點狀態的空
/*************************************
*** UnstructuredPassInterAnalysis ***
*************************************/
void UnstructuredPassInterAnalysis::runAnalysis()
{
set<FunctionState*> allFuncs = FunctionState::getAllDefinedFuncs(); // call a static function to get all function state s
// Go through functions one by one, call an intra-procedural analysis on each of them
// iterate over all functions with bodies
for(set<FunctionState*>::iterator it=allFuncs.begin(); it!=allFuncs.end(); it++)
{
FunctionState* fState = *it;
intraAnalysis->runAnalysis(fState->func, &(fState->state));
}
}
// runs the intra-procedural analysis on the given function, returns true if
// the function's NodeState gets modified as a result and false otherwise
// state - the function's NodeState
bool UnstructuredPassIntraAnalysis::runAnalysis(const Function& func, NodeState* state)
{
DataflowNode funcCFGStart = cfgUtils::getFuncStartCFG(func.get_definition(),filter);
DataflowNode funcCFGEnd = cfgUtils::getFuncEndCFG(func.get_definition(), filter);
if(analysisDebugLevel>=2)
Dbg::dbg << "UnstructuredPassIntraAnalysis::runAnalysis() function "<<func.get_name().getString()<<"()\n";
// iterate over all the nodes in this function
for(VirtualCFG::iterator it(funcCFGStart); it!=VirtualCFG::dataflow::end(); it++)
{
DataflowNode n = *it;
// The number of NodeStates associated with the given dataflow node
//int numStates=NodeState::numNodeStates(n);
// The actual NodeStates associated with the given dataflow node
const vector<NodeState*> nodeStates = NodeState::getNodeStates(n);
// Visit each CFG node
for(vector<NodeState*>::const_iterator itS = nodeStates.begin(); itS!=nodeStates.end(); itS++)
visit(func, n, *(*itS));
}
return false;
}
示例:檢索活性分析的輸入格
void getAllLiveVarsAt(LiveDeadVarsAnalysis* ldva, const NodeState& state, set<varID>& vars, string indent)
- LiveVarsLattice* liveLAbove = dynamic_cast<LiveVarsLattice*>(*(state.getLatticeAbove(ldva).begin()));
警告:格與格值
- 根據定義,格是一組值。但是,通用資料流框架中格型別的例項也用於表示格中的單個值。對於此混淆,我們深感抱歉。我們歡迎您提出改進建議。
請參閱 ROSE 編譯器框架/格
儲存附加到 CFG 節點的資料流分析資訊。
基本操作
- 儲存什麼:格值集、底部、頂部以及介於兩者之間的任何內容
- 初始化:LiveDeadVarsAnalysis::genInitState()
- 建立:轉移函式
- 相遇操作:格的成員函式
示例
- 活性分析:CFG 節點入口處的活躍變數集
- 常量傳播:格值從無資訊(底部)-> 未知 -> 常量 -> 過多資訊(衝突的常量值,頂部),
// blindly add all of that_arg's values into current lattice's value set void LiveVarsLattice::incorporateVars(Lattice* that_arg) // retrieve a subset lattice information for a given expr. This lattice may contain more information than those about a given expr. Lattice* LiveVarsLattice::project(SgExpression* expr) // add lattice (exprState)information about expr into current lattice's value set: default implementation just calls meetUpdate(exprState) bool LiveVarsLattice::unProject(SgExpression* expr, Lattice* exprState)
該概念基於原始 CFG 流方向
- 上方:傳入邊方向
- 下方:傳出邊方向
輸入和輸出取決於問題的方向,前向與後向
- 前向方向:輸入 == 上方格,輸出 = 下方格
- 後向方向:輸入 == 下方格,輸出 = 上方格
該框架提供了一些預定義的格,可供使用。
lattice.h/latticeFull.h
- BoolAndLattice
class LiveVarsLattice : public FiniteLattice
{
public:
std::set<varID> liveVars; // bottom is all live variables, top is the empty set, meet brings down the lattice -> union of variables.
...
};
// Meet operation: simplest set union of two lattices:
// computes the meet of this and that and saves the result in this
// returns true if this causes this to change and false otherwise
bool LiveVarsLattice::meetUpdate(Lattice* that_arg)
{
bool modified = false;
LiveVarsLattice* that = dynamic_cast<LiveVarsLattice*>(that_arg);
// Add all variables from that to this
for(set<varID>::iterator var=that->liveVars.begin(); var!=that->liveVars.end(); var++) {
// If this lattice doesn't yet record *var as being live
if(liveVars.find(*var) == liveVars.end()) { // this if () statement gives a chance to set the modified flag.
// otherwise, liveVars.insert() can be directly called.
modified = true;
liveVars.insert(*var);
}
}
return modified;
}
基礎:Data_flow_analysis#flow.2Ftransfer_function
- IN = OUT(所有前驅節點)的並集
- OUT = GEN + (IN - KILL)
程式結構對當前格的影響(如何改變當前格)。
- 格:儲存IN和OUT資訊
- 傳遞函式內部需要額外的成員變數來儲存GEN和KILL集合。
類層次結構
class IntraDFTransferVisitor : public ROSE_VisitorPatternDefaultBase
{
protected:
// Common arguments to the underlying transfer function
const Function &func; // which function are we talking about
const DataflowNode &dfNode; // wrapper of CFGNode
NodeState &nodeState; // lattice element state, context information?
const std::vector<Lattice*> &dfInfo; // data flow information
public:
IntraDFTransferVisitor(const Function &f, const DataflowNode &n, NodeState &s, const std::vector<Lattice*> &d)
: func(f), dfNode(n), nodeState(s), dfInfo(d)
{ }
virtual bool finish() = 0;
virtual ~IntraDFTransferVisitor() { }
};
class LiveDeadVarsTransfer : public IntraDFTransferVisitor
{
};
class ConstantPropagationAnalysisTransfer : public VariableStateTransfer<ConstantPropagationLattice>
{}
template <class LatticeType>
class VariableStateTransfer : public IntraDFTransferVisitor
{
...
};
class ConstantPropagationAnalysisTransfer : public VariableStateTransfer<ConstantPropagationLattice> {};
void
ConstantPropagationAnalysisTransfer::visit(SgIntVal *sgn)
{
ROSE_ASSERT(sgn != NULL);
ConstantPropagationLattice* resLat = getLattice(sgn);
ROSE_ASSERT(resLat != NULL);
resLat->setValue(sgn->get_value());
resLat->setLevel(ConstantPropagationLattice::constantValue);
}
將程式點轉換為生成器和KILL集合的函式。用於活躍性分析
- Kill (s) = {在s中被定義的變數}://
- Gen (s) = {在s中被使用的變數}
OUT = IN - KILL + GEN
- OUT初始化為IN集合,
- 傳遞函式將應用-KILL + GEN
class LiveDeadVarsTransfer : public IntraDFTransferVisitor
{
LiveVarsLattice* liveLat; // the result of this analysis
bool modified;
// Expressions that are assigned by the current operation
std::set<SgExpression*> assignedExprs; // KILL () set
// Variables that are assigned by the current operation
std::set<varID> assignedVars;
// Variables that are used/read by the current operation
std::set<varID> usedVars; // GEN () set
public:
LiveDeadVarsTransfer(const Function &f, const DataflowNode &n, NodeState &s, const std::vector<Lattice*> &d, funcSideEffectUses *fseu_)
: IntraDFTransferVisitor(f, n, s, d), indent(" "), liveLat(dynamic_cast<LiveVarsLattice*>(*(dfInfo.begin()))), modified(false), fseu(fseu_)
{
if(liveDeadAnalysisDebugLevel>=1) Dbg::dbg << indent << "liveLat="<<liveLat->str(indent + " ")<<std::endl;
// Make sure that all the lattice is initialized
liveLat->initialize();
}
bool finish();
// operationg on different AST nodes
void visit(SgExpression *);
void visit(SgInitializedName *);
void visit(SgReturnStmt *);
void visit(SgExprStatement *);
void visit(SgCaseOptionStmt *);
void visit(SgIfStmt *);
void visit(SgForStatement *);
void visit(SgWhileStmt *);
void visit(SgDoWhileStmt *);
}
// Helper transfer function, focusing on handling expressions.
// live dead variable analysis: LDVA,
// expression transfer: transfer functions for expressions
/// Visits live expressions - helper to LiveDeadVarsTransfer
class LDVAExpressionTransfer : public ROSE_VisitorPatternDefaultBase
{
LiveDeadVarsTransfer &ldva;
public:
// Plain assignment: lhs = rhs, set GEN (read/used) and KILL (written/assigned) sets
void visit(SgAssignOp *sgn) {
ldva.assignedExprs.insert(sgn->get_lhs_operand());
// If the lhs of the assignment is a complex expression (i.e. it refers to a variable that may be live) OR
// if is a known expression that is known to may-be-live
// THIS CODE ONLY APPLIES TO RHSs THAT ARE SIDE-EFFECT-FREE AND WE DON'T HAVE AN ANALYSIS FOR THAT YET
/*if(!isVarExpr(sgn->get_lhs_operand()) ||
(isVarExpr(sgn->get_lhs_operand()) &&
liveLat->isLiveVar(SgExpr2Var(sgn->get_lhs_operand()))))
{ */
ldva.used(sgn->get_rhs_operand());
}
...
}
(gdb) bt
#0 LDVAExpressionTransfer::visit (this=0x7fffffffcea0, sgn=0xa20320)
at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/simpleAnalyses/liveDeadVarAnalysis.C:228
#1 0x00002aaaac3d9968 in SgAssignOp::accept (this=0xa20320, visitor=...) at Cxx_Grammar.C:143069
#2 0x00002aaaadc61c04 in LiveDeadVarsTransfer::visit (this=0xaf9e00, sgn=0xa20320)
at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/simpleAnalyses/liveDeadVarAnalysis.C:384
#3 0x00002aaaadbbaef0 in ROSE_VisitorPatternDefaultBase::visit (this=0xaf9e00, variable_SgBinaryOp=0xa20320) at ../../../src/frontend/SageIII/Cxx_Grammar.h:316006
#4 0x00002aaaadbba04a in ROSE_VisitorPatternDefaultBase::visit (this=0xaf9e00, variable_SgAssignOp=0xa20320) at ../../../src/frontend/SageIII/Cxx_Grammar.h:315931
#5 0x00002aaaac3d9968 in SgAssignOp::accept (this=0xa20320, visitor=...) at Cxx_Grammar.C:143069
#6 0x00002aaaadbcca0a in IntraUniDirectionalDataflow::runAnalysis (this=0x7fffffffd9f0, func=..., fState=0xafbd18, analyzeDueToCallers=true, calleesUpdated=...)
at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/analysis/dataflow.C:282
#7 0x00002aaaadbbf444 in IntraProceduralDataflow::runAnalysis (this=0x7fffffffda00, func=..., state=0xafbd18)
at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/analysis/dataflow.h:74
#8 0x00002aaaadbb0966 in UnstructuredPassInterDataflow::runAnalysis (this=0x7fffffffda50)
at ../../../../sourcetree/src/midend/programAnalysis/genericDataflow/analysis/analysis.C:467
#9 0x000000000040381a in main (argc=2, argv=0x7fffffffdba8)
at ../../../../../sourcetree/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests/liveDeadVarAnalysisTest.C:101
通用資料流框架在ROSE中工作於虛擬控制流圖。
原始的虛擬CFG對於所有型別的分析可能並不理想,因為它可能包含太多與問題無關的管理節點。
因此,框架向Analysis類提供了一個過濾器引數。除非您指定自己的過濾器,否則將使用預設過濾器。
// Example filter funtion deciding if a CFGnNode should show up or not
bool gfilter (CFGNode cfgn)
{
SgNode *node = cfgn.getNode();
switch (node->variantT())
{
//Keep the last index for initialized names. This way the def of the variable doesn't propagate to its assign initializer.
case V_SgInitializedName:
return (cfgn == node->cfgForEnd());
// For function calls, we only keep the last node. The function is actually called after all its parameters are evaluated.
case V_SgFunctionCallExp:
return (cfgn == node->cfgForEnd());
//For basic blocks and other "container" nodes, keep the node that appears before the contents are executed
case V_SgBasicBlock:
case V_SgExprStatement:
case V_SgCommaOpExp:
return (cfgn == node->cfgForBeginning());
// Must have a default case: return interesting CFGNode by default in this example
default:
return cfgn.isInteresting();
}
}
// Code using the filter function
int
main( int argc, char * argv[] )
{
SgProject* project = frontend(argc,argv);
initAnalysis(project);
LiveDeadVarsAnalysis ldva(project);
ldva.filter = gfilter; // set the filter to be your own one
UnstructuredPassInterDataflow ciipd_ldva(&ldva);
ciipd_ldva.runAnalysis();
....
}
關鍵函式
bool IntraUniDirectionalDataflow::runAnalysis(const Function& func, NodeState* fState, bool analyzeDueToCallers, set<Function> calleesUpdated) // analysis/dataflow.C
基本任務:透過以下方式執行分析
- 初始化資料流狀態:格和其他資訊
- 遍歷CFG:查詢當前節點的後繼節點
- 呼叫傳遞函式
- Analysis -> IntraProceduralAnalysis -> IntraProceduralDataflow -> IntraUnitDataflow --> IntraUniDirectionalDataflow (感興趣級別) -> IntraBWDataflow -> LiveDeadVarsAnalysis
class Analysis {}; // an empty abstract class for any analysis
class IntraProceduralAnalysis : virtual public Analysis //analysis/analysis.h , any intra procedural analysis, data flow or not
{
protected:
InterProceduralAnalysis* interAnalysis;
public:
void setInterAnalysis(InterProceduralAnalysis* interAnalysis) // connection to inter procedural analysis
virtual bool runAnalysis(const Function& func, NodeState* state)=0; // run this per function, NodeState stores lattices for each CFG node, etc.
virtual ~IntraProceduralAnalysis();
}
//No re-entry. analysis will be executed once??, data flow , intra-procedural analysis
// now lattices are interested
class IntraProceduralDataflow : virtual public IntraProceduralAnalysis //analysis/dataflow.h
{
// initialize lattice etc for a given dataflow node within a function
virtual void genInitState (const Function& func, const DataflowNode& n, const NodeState& state,
std::vector<Lattice*>& initLattices, std::vector<NodeFact*>& initFacts);
virtual bool runAnalysis(const Function& func, NodeState* state, bool analyzeDueToCallers, std::set<Function> calleesUpdated)=0; // the analysis on a function could be triggered by the state changes of function's callers, or callees.
std::set<Function> visited; // make sure a function is initialized once when visited multiple times
}
class IntraUnitDataflow : virtual public IntraProceduralDataflow
{
// transfer function: operate on lattices associated with a dataflow node, considering its current state
virtual bool transfer(const Function& func, const DataflowNode& n, NodeState& state, const std::vector<Lattice*>& dfInfo)=0;
};
// Uni directional dataflow: either forward or backward, but not both directions!
class IntraUniDirectionalDataflow : public IntraUnitDataflow {
public:
bool runAnalysis(const Function& func, NodeState* state, bool analyzeDueToCallers, std::set<Function> calleesUpdated);
protected:
bool propagateStateToNextNode (
const std::vector<Lattice*>& curNodeState, DataflowNode curDFNode, int nodeIndex,
const std::vector<Lattice*>& nextNodeState, DataflowNode nextDFNode);
std::vector<DataflowNode> gatherDescendants(std::vector<DataflowEdge> edges,
DataflowNode (DataflowEdge::*edgeFn)() const);
virtual NodeState*initializeFunctionNodeState(const Function &func, NodeState *fState) = 0;
virtual VirtualCFG::dataflow*
getInitialWorklist(const Function &func, bool firstVisit, bool analyzeDueToCallers, const set<Function> &calleesUpdated, NodeState *fState) = 0;
virtual vector<Lattice*> getLatticeAnte(NodeState *state) = 0;
virtual vector<Lattice*> getLatticePost(NodeState *state) = 0;
// If we're currently at a function call, use the associated inter-procedural
// analysis to determine the effect of this function call on the dataflow state.
virtual void transferFunctionCall(const Function &func, const DataflowNode &n, NodeState *state) = 0;
virtual vector<DataflowNode> getDescendants(const DataflowNode &n) = 0;
virtual DataflowNode getUltimate(const Function &func) = 0; // ultimate what? final CFG node?
};
class IntraBWDataflow : public IntraUniDirectionalDataflow {//BW: Backward
public:
IntraBWDataflow()
{}
NodeState* initializeFunctionNodeState(const Function &func, NodeState *fState);
VirtualCFG::dataflow*
getInitialWorklist(const Function &func, bool firstVisit, bool analyzeDueToCallers, const set<Function> &calleesUpdated, NodeState *fState);
virtual vector<Lattice*> getLatticeAnte(NodeState *state);
virtual vector<Lattice*> getLatticePost(NodeState *state);
void transferFunctionCall(const Function &func, const DataflowNode &n, NodeState *state);
vector<DataflowNode> getDescendants(const DataflowNode &n); // next CFG nodes, depending on the direction
{ return gatherDescendants(n.inEdges(), &DataflowEdge::source); }
DataflowNode getUltimate(const Function &func); // the last CFG should be the start CFG of the function for a backward dataflow problem
{ return cfgUtils::getFuncStartCFG(func.get_definition()); }
};
向前過程內資料流分析:例如,到達定義()
- 類IntraFWDataflow : public IntraUniDirectionalDataflow
用於初始化CFG節點的格/事實。它本身就是一個分析。非結構化傳遞
// super class: provides the driver of initialization: visit each CFG node
class UnstructuredPassIntraAnalysis : virtual public IntraProceduralAnalysis
{
public:
// call the initialization function on each CFG node
bool runAnalysis(const Function& func, NodeState* state);
// to be implemented by InitDataflowState
virtual void visit(const Function& func, const DataflowNode& n, NodeState& state)=0;
}
bool UnstructuredPassIntraAnalysis::runAnalysis(const Function& func, NodeState* state)
{
DataflowNode funcCFGStart = cfgUtils::getFuncStartCFG(func.get_definition());
DataflowNode funcCFGEnd = cfgUtils::getFuncEndCFG(func.get_definition());
if(analysisDebugLevel>=2)
Dbg::dbg << "UnstructuredPassIntraAnalysis::runAnalysis() function "<<func.get_name().getString()<<"()\n";
// iterate over all the nodes in this function
for(VirtualCFG::iterator it(funcCFGStart); it!=VirtualCFG::dataflow::end(); it++)
{
DataflowNode n = *it;
// The number of NodeStates associated with the given dataflow node
//int numStates=NodeState::numNodeStates(n);
// The actual NodeStates associated with the given dataflow node
const vector<NodeState*> nodeStates = NodeState::getNodeStates(n);
// Visit each CFG node
for(vector<NodeState*>::const_iterator itS = nodeStates.begin(); itS!=nodeStates.end(); itS++)
visit(func, n, *(*itS));
}
return false;
}
//-------------------- derived class provide link to a concrete analysis, and visit() implementation
class InitDataflowState : public UnstructuredPassIntraAnalysis
{
IntraProceduralDataflow* dfAnalysis; // link to the dataflow analysis to be initialized
public:
InitDataflowState(IntraProceduralDataflow* dfAnalysis/*, std::vector<Lattice*> &initState*/)
{
this->dfAnalysis = dfAnalysis;
}
void visit(const Function& func, const DataflowNode& n, NodeState& state);
};
void InitDataflowState::visit (const Function& func, const DataflowNode& n, NodeState& state)
{
...
dfAnalysis->genInitState(func, n, state, initLats, initFacts);
state.setLattices((Analysis*)dfAnalysis, initLats);
state.setFacts((Analysis*)dfAnalysis, initFacts);
....
}
CFG節點列表,透過迭代器介面訪問
auto_ptr<VirtualCFG::dataflow> workList(getInitialWorklist(func, firstVisit, analyzeDueToCallers, calleesUpdated, fState));
class iterator //Declared in cfgUtils/VirtualCFGIterator.h
{
public:
std::list<DataflowNode> remainingNodes;
std::set<DataflowNode> visited;
bool initialized;
protected:
// returns true if the given DataflowNode is in the remainingNodes list and false otherwise
bool isRemaining(DataflowNode n);
// advances this iterator in the given direction. Forwards if fwDir=true and backwards if fwDir=false.
// if pushAllChildren=true, all of the current node's unvisited children (predecessors or successors,
// depending on fwDir) are pushed onto remainingNodes
void advance(bool fwDir, bool pushAllChildren);
public:
virtual void operator ++ (int);
bool eq(const iterator& other_it) const;
bool operator==(const iterator& other_it) const;
bool operator!=(const iterator& it) const;
...
};
void iterator::advance(bool fwDir, bool pushAllChildren)
{
ROSE_ASSERT(initialized);
/*printf(" iterator::advance(%d) remainingNodes.size()=%d\n", fwDir, remainingNodes.size());
cout<<" visited=\n";
for(set<DataflowNode>::iterator it=visited.begin(); it!=visited.end(); it++)
cout << " <"<<it->getNode()->class_name()<<" | "<<it->getNode()<<" | "<<it->getNode()->unparseToString()<<">\n";*/
if(remainingNodes.size()>0)
{
// pop the next CFG node from the front of the list
DataflowNode cur = remainingNodes.front();
remainingNodes.pop_front();
if(pushAllChildren)
{
// find its followers (either successors or predecessors, depending on value of fwDir), push back
// those that have not yet been visited
vector<DataflowEdge> nextE;
if(fwDir)
nextE = cur.outEdges();
else
nextE = cur.inEdges();
for(vector<DataflowEdge>::iterator it=nextE.begin(); it!=nextE.end(); it++)
{
DataflowNode nextN((*it).target()/* need to put something here because DataflowNodes don't have a default constructor*/);
if(fwDir) nextN = (*it).target();
else nextN = (*it).source();
/*cout << " iterator::advance "<<(fwDir?"descendant":"predecessor")<<": "<<
"<"<<nextN.getNode()->class_name()<<" | "<<nextN.getNode()<<" | "<<nextN.getNode()->unparseToString()<<">, "<<
"visited="<<(visited.find(nextN) != visited.end())<<
" remaining="<<isRemaining(nextN)<<"\n";*/
// if we haven't yet visited this node and don't yet have it on the remainingNodes list
if(visited.find(nextN) == visited.end() &&
!isRemaining(nextN))
{
//printf(" pushing back node <%s: 0x%x: %s> visited=%d\n", nextN.getNode()->class_name().c_str(), nextN.getNode(), nextN.getNode()->unparseToString().c_str(), visited.find(nextN)!=visited.end());
remainingNodes.push_back(nextN);
}
}
}
// if we still have any nodes left remaining
if(remainingNodes.size()>0)
{
// take the next node from the front of the list and mark it as visited
//visited[remainingNodes.front()] = true;
visited.insert(remainingNodes.front());
}
}
}
class dataflow : public virtual iterator {};
class back_dataflow: public virtual dataflow {};
void back_dataflow::operator ++ (int)
{
advance(false, true); // backward, add all children
}
class IntraUniDirectionalDataflow : public IntraUnitDataflow
{ ...
virtual VirtualCFG::dataflow*
getInitialWorklist(const Function &func, bool firstVisit, bool analyzeDueToCallers, const set<Function> &calleesUpdated, NodeState *fState) = 0;
}
在派生類中實現
- VirtualCFG::dataflow* IntraFWDataflow::getInitialWorklist ()
- VirtualCFG::dataflow* IntraBWDataflow::getInitialWorklist()
b是CFG中的一個基本塊
- // 流入b的資訊是b的所有前驅節點流出的資訊的並集/連線
- // 流出S的資訊是b生成的資訊減去b殺死的資訊。這就是作用於b的傳遞函式!!
bool IntraUniDirectionalDataflow::runAnalysis(const Function& func, NodeState* fState, bool analyzeDueToCallers, set<Function> calleesUpdated)
{
// Iterate over the nodes in this function that are downstream from the nodes added above
for(; it != itEnd; it++)
{
DataflowNode n = *it;
SgNode* sgn = n.getNode();
...
for(vector<NodeState*>::const_iterator itS = nodeStates.begin(); itS!=nodeStates.end(); )
{
state = *itS;
const vector<Lattice*> dfInfoAnte = getLatticeAnte(state); // IN set
const vector<Lattice*> dfInfoPost = getLatticePost(state); // OUT set
// OUT = IN first // transfer within the node: from IN to OUT,
// Overwrite the Lattices below this node with the lattices above this node.
// The transfer function will then operate on these Lattices to produce the
// correct state below this node.
vector<Lattice*>::const_iterator itA, itP;
int j=0;
for(itA = dfInfoAnte.begin(), itP = dfInfoPost.begin();
itA != dfInfoAnte.end() && itP != dfInfoPost.end();
itA++, itP++, j++)
{
if(analysisDebugLevel>=1){ //
Dbg::dbg << " Meet Before: Lattice "<<j<<": \n "<<(*itA)->str(" ")<<endl;
Dbg::dbg << " Meet After: Lattice "<<j<<": \n "<<(*itP)->str(" ")<<endl;
}
(*itP)->copy(*itA);
/*if(analysisDebugLevel>=1){
Dbg::dbg << " Copied Meet Below: Lattice "<<j<<": \n "<<(*itB)->str(" ")<<endl;
}*/
}
// =================== TRANSFER FUNCTION ===================
// (IN - KILL ) + GEN
if (isSgFunctionCallExp(sgn))
transferFunctionCall(func, n, state);
boost::shared_ptr<IntraDFTransferVisitor> transferVisitor = getTransferVisitor(func, n, *state, dfInfoPost);
sgn->accept(*transferVisitor);
modified = transferVisitor->finish() || modified;
// =================== TRANSFER FUNCTION ===================
...//
}
}
這被證明對於沿著路徑傳播資訊至關重要。不能註釋掉!!
???不確定此步驟與之前的步驟(Meet Before() / Meet After)之間的區別
meetUpdate() 也在這裡被呼叫
// Propagates the dataflow info from the current node's NodeState (curNodeState) to the next node's
// NodeState (nextNodeState).
// Returns true if the next node's meet state is modified and false otherwise.
bool IntraUniDirectionalDataflow::propagateStateToNextNode(
const vector<Lattice*>& curNodeState, DataflowNode curNode, int curNodeIndex,
const vector<Lattice*>& nextNodeState, DataflowNode nextNode)
{
bool modified = false;
vector<Lattice*>::const_iterator itC, itN;
if(analysisDebugLevel>=1){
Dbg::dbg << "\n Propagating to Next Node: "<<nextNode.getNode()<<"["<<nextNode.getNode()->class_name()<<" | "<<Dbg::escape(nextNode.getNode()->unparseToString())<<"]"<<endl;
int j;
for(j=0, itC = curNodeState.begin(); itC != curNodeState.end(); itC++, j++)
Dbg::dbg << " Cur node: Lattice "<<j<<": \n "<<(*itC)->str(" ")<<endl;
for(j=0, itN = nextNodeState.begin(); itN != nextNodeState.end(); itN++, j++)
Dbg::dbg << " Next node: Lattice "<<j<<": \n "<<(*itN)->str(" ")<<endl;
}
// Update forward info above nextNode from the forward info below curNode.
// Compute the meet of the dataflow information along the curNode->nextNode edge with the
// next node's current state one Lattice at a time and save the result above the next node.
for(itC = curNodeState.begin(), itN = nextNodeState.begin();
itC != curNodeState.end() && itN != nextNodeState.end();
itC++, itN++)
{
// Finite Lattices can use the regular meet operator, while infinite Lattices
// must also perform widening to ensure convergence.
if((*itN)->finiteLattice())
modified = (*itN)->meetUpdate(*itC) || modified;
else
{
//InfiniteLattice* meetResult = (InfiniteLattice*)itN->second->meet(itC->second);
InfiniteLattice* meetResult = dynamic_cast<InfiniteLattice*>((*itN)->copy());
Dbg::dbg << " *itN: " << dynamic_cast<InfiniteLattice*>(*itN)->str(" ") << endl;
Dbg::dbg << " *itC: " << dynamic_cast<InfiniteLattice*>(*itC)->str(" ") << endl;
meetResult->meetUpdate(*itC);
Dbg::dbg << " meetResult: " << meetResult->str(" ") << endl;
// Widen the resulting meet
modified = dynamic_cast<InfiniteLattice*>(*itN)->widenUpdate(meetResult);
delete meetResult;
}
}
if(analysisDebugLevel>=1) {
if(modified)
{
Dbg::dbg << " Next node's in-data modified. Adding..."<<endl;
int j=0;
for(itN = nextNodeState.begin(); itN != nextNodeState.end(); itN++, j++)
{
Dbg::dbg << " Propagated: Lattice "<<j<<": \n "<<(*itN)->str(" ")<<endl;
}
}
else
Dbg::dbg << " No modification on this node"<<endl;
}
return modified;
}
class IntraUniDirectionalDataflow : public IntraUnitDataflow
{
public:
protected:
// propagates the dataflow info from the current node's NodeState (curNodeState) to the next node's NodeState (nextNodeState)
// return true if any state is modified.
bool propagateStateToNextNode(
const std::vector<Lattice*>& curNodeState, DataflowNode curDFNode, int nodeIndex,
const std::vector<Lattice*>& nextNodeState, DataflowNode nextDFNode);
}
向後過程內資料流分析:例如,活躍性分析(使用 --> 向後 --> 定義)
- 類IntraBWDataflow : public IntraUniDirectionalDataflow
class LiveDeadVarsAnalysis : public IntraBWDataflow {
protected:
funcSideEffectUses* fseu;
public:
LiveDeadVarsAnalysis(SgProject *project, funcSideEffectUses* fseu=NULL);
// Generates the initial lattice state for the given dataflow node, in the given function, with the given NodeState
void genInitState(const Function& func, const DataflowNode& n, const NodeState& state,
std::vector<Lattice*>& initLattices, std::vector<NodeFact*>& initFacts);
boost::shared_ptr<IntraDFTransferVisitor> getTransferVisitor(const Function& func, const DataflowNode& n,
NodeState& state, const std::vector<Lattice*>& dfInfo)
{ return boost::shared_ptr<IntraDFTransferVisitor>(new LiveDeadVarsTransfer(func, n, state, dfInfo, fseu)); }
bool transfer(const Function& func, const DataflowNode& n, NodeState& state, const std::vector<Lattice*>& dfInfo) { assert(0); return false; }
};
關鍵:應用於呼叫點的轉移函式,用於在函式邊界之間執行適當的狀態轉移。
void IntraFWDataflow::transferFunctionCall(const Function &func, const DataflowNode &n, NodeState *state)
{
vector<Lattice*> dfInfoBelow = state->getLatticeBelow(this);
vector<Lattice*>* retState = NULL;
dynamic_cast<InterProceduralDataflow*>(interAnalysis)->
transfer(func, n, *state, dfInfoBelow, &retState, true);
if(retState && !(retState->size()==0 || (retState->size() == dfInfoBelow.size()))) {
Dbg::dbg << "#retState="<<retState->size()<<endl;
for(vector<Lattice*>::iterator ml=retState->begin(); ml!=retState->end(); ml++)
Dbg::dbg << " "<<(*ml)->str(" ")<<endl;
Dbg::dbg << "#dfInfoBelow="<<dfInfoBelow.size()<<endl;
for(vector<Lattice*>::const_iterator l=dfInfoBelow.begin(); l!=dfInfoBelow.end(); l++)
Dbg::dbg << " "<<(*l)->str(" ")<<endl;
}
// Incorporate information about the function's return value into the caller's dataflow state
// as the information of the SgFunctionCallExp
ROSE_ASSERT(retState==NULL || retState->size()==0 || (retState->size() == dfInfoBelow.size()));
if(retState) {
vector<Lattice*>::iterator lRet;
vector<Lattice*>::const_iterator lDF;
for(lRet=retState->begin(), lDF=dfInfoBelow.begin();
lRet!=retState->end(); lRet++, lDF++) {
Dbg::dbg << " lDF Before="<<(*lDF)->str(" ")<<endl;
Dbg::dbg << " lRet Before="<<(*lRet)->str(" ")<<endl;
(*lDF)->unProject(isSgFunctionCallExp(n.getNode()), *lRet);
Dbg::dbg << " lDF After="<<(*lDF)->str(" ")<<endl;
}
}
}
InterProceduralDataflow::InterProceduralDataflow(IntraProceduralDataflow* intraDataflowAnalysis) :
InterProceduralAnalysis((IntraProceduralAnalysis*)intraDataflowAnalysis)
// !!! NOTE: cfgForEnd() AND cfgForBeginning() PRODUCE THE SAME SgFunctionDefinition SgNode BUT THE DIFFERENT INDEXES
// !!! (0 FOR BEGINNING AND 3 FOR END). AS SUCH, IT DOESN'T MATTER WHICH ONE WE CHOOSE. HOWEVER, IT DOES MATTER
// !!! WHETHER WE CALL genInitState TO GENERATE THE STATE BELOW THE NODE (START OF THE FUNCTION) OR ABOVE IT
// !!! (END OF THE FUNCTION). THE CAPABILITY TO DIFFERENTIATE THE TWO CASES NEEDS TO BE ADDED TO genInitState
// !!! AND WHEN IT IS, WE'LL NEED TO CALL IT INDEPENDENTLY FOR cfgForEnd() AND cfgForBeginning() AND ALSO TO MAKE
// !!! TO SET THE LATTICES ABOVE THE ANALYSIS
待辦事項:函式定義的開始和結束問題在此處提及
最簡單的形式:根本沒有呼叫點的轉移操作。
class UnstructuredPassInterDataflow : virtual public InterProceduralDataflow
{
public:
UnstructuredPassInterDataflow(IntraProceduralDataflow* intraDataflowAnalysis)
: InterProceduralAnalysis((IntraProceduralAnalysis*)intraDataflowAnalysis), InterProceduralDataflow(intraDataflowAnalysis)
{}
// the transfer function that is applied to SgFunctionCallExp nodes to perform the appropriate state transfers
// fw - =true if this is a forward analysis and =false if this is a backward analysis
// n - the dataflow node that is being processed
// state - the NodeState object that describes the dataflow state immediately before (if fw=true) or immediately after
// (if fw=false) the SgFunctionCallExp node, as established by earlier analysis passes
// dfInfo - the Lattices that this transfer function operates on. The function propagates them
// to the calling function and overwrites them with the dataflow result of calling this function.
// retState - Pointer reference to a Lattice* vector that will be assigned to point to the lattices of
// the function call's return value. The callee may not modify these lattices.
// Returns true if any of the input lattices changed as a result of the transfer function and
// false otherwise.
bool transfer(const Function& func, const DataflowNode& n, NodeState& state,
const std::vector<Lattice*>& dfInfo, std::vector<Lattice*>** retState, bool fw)
{
return false;
}
void runAnalysis();
};
// simply call intra-procedural analysis on each function one by one.
void UnstructuredPassInterDataflow::runAnalysis()
{
set<FunctionState*> allFuncs = FunctionState::getAllDefinedFuncs();
// iterate over all functions with bodies
for(set<FunctionState*>::iterator it=allFuncs.begin(); it!=allFuncs.end(); it++)
{
const Function& func = (*it)->func;
FunctionState* fState = FunctionState::getDefinedFuncState(func);
// Call the current intra-procedural dataflow as if it were a generic analysi
intraAnalysis->runAnalysis(func, &(fState->state));
}
}
待辦事項
直接呼叫:在給定函式上執行過程內分析,如果函式的節點狀態因此被修改則返回 true,否則返回 false 狀態 - 函式的節點狀態
- bool IntraUniDirectionalDataflow::runAnalysis(const Function& func, NodeState* state, bool analyzeDueToCallers, std::set<Function> calleesUpdated);
- 使用更簡單的引數列表進行直接呼叫:不可行,所有過程內分析都必須在內部設定過程間分析!
bool IntraProceduralDataflow::runAnalysis(const Function& func, NodeState* state)
{
// Each function is analyzed as if it were called directly by the language's runtime, ignoring
// the application's actual call graph
bool analyzeDueToCallers = true;
// We ignore the application's call graph, so it doesn't matter whether this function calls other functions
std::set<Function> calleesUpdated;
return runAnalysis(func, state, analyzeDueToCallers, calleesUpdated);
}
透過非結構化傳遞過程間資料流類呼叫簡單的過程內分析。
int main()
{
SgProject* project = frontend(argc,argv);
initAnalysis(project);
// prepare debugging support
Dbg::init("Live dead variable analysis Test", ".", "index.html");
liveDeadAnalysisDebugLevel = 1;
analysisDebugLevel = 1;
// basis analysis
LiveDeadVarsAnalysis ldva(project);
// wrap it inside the unstructured inter-procedural data flow
UnstructuredPassInterDataflow ciipd_ldva(&ldva);
ciipd_ldva.runAnalysis();
.....
}
示例程式碼
// Initialize vars to hold all the variables and expressions that are live at DataflowNode n
//void getAllLiveVarsAt(LiveDeadVarsAnalysis* ldva, const DataflowNode& n, const NodeState& state, set<varID>& vars, string indent)
void getAllLiveVarsAt(LiveDeadVarsAnalysis* ldva, const NodeState& state, set<varID>& vars, string indent)
{
LiveVarsLattice* liveLAbove = dynamic_cast<LiveVarsLattice*>(*(state.getLatticeAbove(ldva).begin()));
LiveVarsLattice* liveLBelow = dynamic_cast<LiveVarsLattice*>(*(state.getLatticeBelow(ldva).begin()));
// The set of live vars AT this node is the union of vars that are live above it and below it
for(set<varID>::iterator var=liveLAbove->liveVars.begin(); var!=liveLAbove->liveVars.end(); var++)
vars.insert(*var);
for(set<varID>::iterator var=liveLBelow->liveVars.begin(); var!=liveLBelow->liveVars.end(); var++)
vars.insert(*var);
}
必須有一種方法來測試分析結果是否正確。
我們目前使用一種原始的方法來測試分析的正確性:比較pragma和格字串輸出。
兩個示例翻譯器測試分析正確性(比較pragma和格字串輸出)
- https://github.com/rose-compiler/rose/blob/master/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests/liveDeadVarAnalysisTest.C
- https://github.com/rose-compiler/rose/blob/master/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests/constantPropagationTest.C
活動性分析正確性的示例測試輸入檔案
- https://github.com/rose-compiler/rose/blob/master/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests/test5.C
int bar(int flag)
{
int a =1,b,c;
#pragma rose [LiveVarsLattice: liveVars=[flag, a, b]]
if (flag == 0) // flag is only read here, not written!
c = a;
else
c = b;
return c;
}
開啟它
liveDeadAnalysisDebugLevel = 1;
analysisDebugLevel = 1;
// find code with
if(analysisDebugLevel>=1) ...
使用瀏覽器檢查網頁轉儲。
firefox index.html
如何閱讀跟蹤檔案:從頭開始:資訊根據訪問的CFG節點排序。順序可以是前向或後向順序。首先檢查順序是否正確,然後檢查每個訪問的節點。
==================================
Copying incoming Lattice 0:
[LiveVarsLattice: liveVars=[b]]
To outgoing Lattice 0:
[LiveVarsLattice: liveVars=[]]
==================================
Transferring the outgoing Lattice ...
liveLat=[LiveVarsLattice: liveVars=[b]]
Dead Expression
usedVars=<>
assignedVars=<>
assignedExprs=<>
#usedVars=0 #assignedExprs=0
Transferred: outgoing Lattice 0:
[LiveVarsLattice: liveVars=[b]]
transferred, modified=0
==================================
Propagating/Merging the outgoing Lattice to all descendant nodes ...
Descendants (1):
~~~~~~~~~~~~
Descendant: 0x2b9e8c47f010[SgIfStmt | if(flag == 0) c = a;else c = b;]
Propagating to Next Node: 0x2b9e8c47f010[SgIfStmt | if(flag == 0) c = a;else c = b;]
Cur node: Lattice 0:
[LiveVarsLattice: liveVars=[b]]
Next node: Lattice 0:
[LiveVarsLattice: liveVars=[a]]
Next node's in-data modified. Adding...
Propagated: Lattice 0:
[LiveVarsLattice: liveVars=[a, b]]
propagated/merged, modified=1
^^^^^^^^^^^^^^^^^^
A real example: if (flag) c = a; else c = b; // liveness analysis, a, b are live in two branches, they are propagated backward to if-stmt
------------------
Descendants (1): // from c =a back to if-stmt (next node)
~~~~~~~~~~~~
Descendant: 0x2ac8bb95c010[SgIfStmt | if(flag == 0) c = a;else c = b;]
Propagating to Next Node: 0x2ac8bb95c010[SgIfStmt | if(flag == 0) c = a;else c = b;]
Cur node: Lattice 0:
[LiveVarsLattice: liveVars=[a]] // current node's lattice
Next node: Lattice 0:
[LiveVarsLattice: liveVars=[]] // next node's lattice before propagation
Next node's in-data modified. Adding...
Propagated: Lattice 0:
[LiveVarsLattice: liveVars=[a]] // propagate a into if-stmt's lattice
propagated, modified=1
^^^^^^^^^^^^^^^^^^
------------------
Descendants (1): // from c = b --> if-stmt
~~~~~~~~~~~~
Descendant: 0x2ac8bb95c010[SgIfStmt | if(flag == 0) c = a;else c = b;]
Propagating to Next Node: 0x2ac8bb95c010[SgIfStmt | if(flag == 0) c = a;else c = b;]
Cur node: Lattice 0:
[LiveVarsLattice: liveVars=[b]]
Next node: Lattice 0:
[LiveVarsLattice: liveVars=[a]]
Next node's in-data modified. Adding...
Propagated: Lattice 0:
[LiveVarsLattice: liveVars=[a, b]] // now both a and b are propagated/ merged
propagated, modified=1
^^^^^^^^^^^^^^^^^^
提供了一個類analysisStatesToDot來生成帶有格資訊的CFG點圖。
//AnalysisDebuggingUtils.C
class analysisStatesToDOT : public UnstructuredPassIntraAnalysis
{
private:
// LiveDeadVarsAnalysis* lda; // reference to the source analysis
Analysis* lda; // reference to the source analysis
void printEdge(const DataflowEdge& e); // print data flow edge
void printNode(const DataflowNode& n, std::string state_string); // print date flow node
void visit(const Function& func, const DataflowNode& n, NodeState& state); // visitor function
public:
std::ostream* ostr;
analysisStatesToDOT (Analysis* l): lda(l){ };
};
namespace Dbg
{
//....
void dotGraphGenerator (::Analysis *a)
{
::analysisStatesToDOT eas(a);
IntraAnalysisResultsToDotFiles upia_eas(eas);
upia_eas.runAnalysis();
}
} // namespace Dbg
// Liao, 12/6/2011
#include "rose.h"
#include <list>
#include <sstream>
#include <iostream>
#include <fstream>
#include <string>
#include <map>
using namespace std;
// TODO group them into one header
#include "genericDataflowCommon.h"
#include "VirtualCFGIterator.h"
#include "cfgUtils.h"
#include "CallGraphTraverse.h"
#include "analysisCommon.h"
#include "analysis.h"
#include "dataflow.h"
#include "latticeFull.h"
#include "printAnalysisStates.h"
#include "liveDeadVarAnalysis.h"
int numFails = 0, numPass = 0;
//-----------------------------------------------------------
int
main( int argc, char * argv[] )
{
SgProject* project = frontend(argc,argv);
initAnalysis(project);
// generating index.html for tracing the analysis
Dbg::init("Live dead variable analysis Test", ".", "index.html");
liveDeadAnalysisDebugLevel = 1;
analysisDebugLevel = 1;
LiveDeadVarsAnalysis ldva(project);
UnstructuredPassInterDataflow ciipd_ldva(&ldva);
ciipd_ldva.runAnalysis();
// Output the dot graph *********************
Dbg::dotGraphGenerator (&ldva);
return 0;
}
- 難以使用生成的格,因為在格中生成了許多臨時表示式物件。但通常使用者並不關心它們(常量傳播、指標分析)。
- 要檢視問題:轉到[build64/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests]
- 執行make check
- 檢視分析的點圖轉儲:run.sh test_ptr4.C_main_0x2b41e651c038_cfg.dot