History log of /gem5/src/arch/x86/isa/microops/fpop.isa
Revision Date Author Comments
# 13613:a19963be12ca 20-Nov-2018 Gabe Black <gabeblack@google.com>

x86: Stop using/defining some ISA specific register types.

These have been replaced with the generic RegVal type.

Change-Id: I75c1134212067dea43aa0903d813633e06f3d6c6
Reviewed-on: https://gem5-review.googlesource.com/c/14476
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 13441:d70ffc3dabf0 20-Nov-2018 Gabe Black <gabeblack@google.com>

x86: Get rid of a problematic DPRINTF in PremFp.

This DPRINTF shouldn't be necessary since it shows the operands and
results of the instruction which the trace should already make
available. Also by passing the destination register to DPRINTF, the ISA
parser will assume that it's also a source when tracking dependencies.

Change-Id: I820387c82578bdbb8d2e3d91652a6c0185077f54
Reviewed-on: https://gem5-review.googlesource.com/c/14475
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>


# 12236:126ac9da6050 04-Nov-2017 Gabe Black <gabeblack@google.com>

alpha,arm,mips,power,riscv,sparc,x86: Merge exec decl templates.

In the ISA instruction definitions, some classes were declared with
execute, etc., functions outside of the main template because they
had CPU specific signatures and would need to be duplicated with
each CPU plugged into them. Now that the instructions always just
use an ExecContext, there's no reason for those templates to be
separate. This change folds those templates together.

Change-Id: I13bda247d3d1cc07c0ea06968e48aa5b4aace7fa
Reviewed-on: https://gem5-review.googlesource.com/5401
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Alec Roelke <ar4jc@virginia.edu>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 12234:78ece221f9f5 02-Nov-2017 Gabe Black <gabeblack@google.com>

alpha,arm,mips,power,riscv,sparc,x86,isa: De-specialize ExecContexts.

The ISA parser used to generate different copies of exec functions
for each exec context class a particular CPU wanted to use. That's
since been changed so that those functions take a pointer to the base
ExecContext, so the code which would generate those extra functions
can be removed, and some functions which used to be templated on an
ExecContext subclass can be untemplated, or minimally less templated.

Now that some functions aren't going to be instantiated multiple times
with different signatures, there are also opportunities to collapse
templates and make many instruction definitions simpler within the
parser. Since those changes will be less mechanical, they're left for
later changes and will probably be done in smaller increments.

Change-Id: I0015307bb02dfb9c60380b56d2a820f12169ebea
Reviewed-on: https://gem5-review.googlesource.com/5381
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>


# 11159:9459593cb649 06-Oct-2015 Steve Reinhardt <steve.reinhardt@amd.com>

x86: implement fild, fucomi, and fucomip x87 insts

fild loads an integer value into the x87 top of stack register.
fucomi/fucomip compare two x87 register values (the latter
also doing a stack pop).
These instructions are used by some versions of GNU libstdc++.


# 10313:01dda09b93e5 01-Sep-2014 Nilay Vaish <nilay@cs.wisc.edu>

x86: set op class of two fp instructions
This patch sets op class of two fp instructions: movfp and pop x87 stack
as IntAluOp since these instructions do not make use of the fp alu.


# 10196:be0e1724eb39 09-May-2014 Curtis Dunham <Curtis.Dunham@arm.com>

arch: teach ISA parser how to split code across files

This patch encompasses several interrelated and interdependent changes
to the ISA generation step. The end goal is to reduce the size of the
generated compilation units for instruction execution and decoding so
that batch compilation can proceed with all CPUs active without
exhausting physical memory.

The ISA parser (src/arch/isa_parser.py) has been improved so that it can
accept 'split [output_type];' directives at the top level of the grammar
and 'split(output_type)' python calls within 'exec {{ ... }}' blocks.
This has the effect of "splitting" the files into smaller compilation
units. I use air-quotes around "splitting" because the files themselves
are not split, but preprocessing directives are inserted to have the same
effect.

Architecturally, the ISA parser has had some changes in how it works.
In general, it emits code sooner. It doesn't generate per-CPU files,
and instead defers to the C preprocessor to create the duplicate copies
for each CPU type. Likewise there are more files emitted and the C
preprocessor does more substitution that used to be done by the ISA parser.

Finally, the build system (SCons) needs to be able to cope with a
dynamic list of source files coming out of the ISA parser. The changes
to the SCons{cript,truct} files support this. In broad strokes, the
targets requested on the command line are hidden from SCons until all
the build dependencies are determined, otherwise it would try, realize
it can't reach the goal, and terminate in failure. Since build steps
(i.e. running the ISA parser) must be taken to determine the file list,
several new build stages have been inserted at the very start of the
build. First, the build dependencies from the ISA parser will be emitted
to arch/$ISA/generated/inc.d, which is then read by a new SCons builder
to finalize the dependencies. (Once inc.d exists, the ISA parser will not
need to be run to complete this step.) Once the dependencies are known,
the 'Environments' are made by the makeEnv() function. This function used
to be called before the build began but now happens during the build.
It is easy to see that this step is quite slow; this is a known issue
and it's important to realize that it was already slow, but there was
no obvious cause to attribute it to since nothing was displayed to the
terminal. Since new steps that used to be performed serially are now in a
potentially-parallel build phase, the pathname handling in the SCons scripts
has been tightened up to deal with chdir() race conditions. In general,
pathnames are computed earlier and more likely to be stored, passed around,
and processed as absolute paths rather than relative paths. In the end,
some of these issues had to be fixed by inserting serializing dependencies
in the build.

Minor note:
For the null ISA, we just provide a dummy inc.d so SCons is never
compelled to try to generate it. While it seems slightly wrong to have
anything in src/arch/*/generated (i.e. a non-generated 'generated' file),
it's by far the simplest solution.


# 10184:bbfa3152bdea 09-May-2014 Curtis Dunham <Curtis.Dunham@arm.com>

arch: remove inline specifiers on all inst constrs, all ISAs

With (upcoming) separate compilation, they are useless. Only
link-time optimization could re-inline them, but ideally
feedback-directed optimization would choose to do so only for
profitable (i.e. common) instructions.


# 9894:c0a3920859bd 29-Sep-2013 Andreas Sandberg <andreas@sandberg.pp.se>

x86: Add support for loading 32-bit and 80-bit floats in the x87

The x87 FPU supports three floating point formats: 32-bit, 64-bit, and
80-bit floats. The current gem5 implementation supports 32-bit and
64-bit floats, but only works correctly for 64-bit floats. This
changeset fixes the 32-bit float handling by correctly loading and
rounding (using truncation) 32-bit floats instead of simply truncating
the bit pattern.

80-bit floats are loaded by first loading the 80-bits of the float to
two temporary integer registers. A micro-op (cvtint_fp80) then
converts the contents of the two integer registers to the internal FP
representation (double). Similarly, when storing an 80-bit float,
there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that
convert an internal FP register to 80-bit and stores the upper 64-bits
or lower 32-bits to an integer register, which is the written to
memory using normal integer stores.


# 9893:5924b77fb8fc 30-Sep-2013 Andreas Sandberg <andreas@sandberg.pp.se>

x86: Fix re-entrancy problems in x87 store instructions

X87 store instructions typically loads and pops the top value of the
stack and stores it in memory. The current implementation pops the
stack at the same time as the floating point value is loaded to a
temporary register. This will corrupt the state of the x87 stack if
the store fails. This changeset introduces a pop87 micro-instruction
that pops the stack and uses this instruction in the affected
macro-instructions to pop the stack after storing the value to memory.


# 9765:da0e0df0ba97 18-Jun-2013 Andreas Sandberg <andreas@sandberg.pp.se>

x86: Add support for maintaining the x87 tag word

The current implementation of the x87 never updates the x87 tag
word. This is currently not a big issue since the simulated x87 never
checks for stack overflows, however this becomes an issue when
switching between a virtualized CPU and a simulated CPU. This
changeset adds support, which is enabled by default, for updating the
tag register to every floating point microop that updates the stack
top using the spm mechanism.

The new tag words is generated by the helper function
X86ISA::genX87Tags(). This function is currently limited to flagging a
stack position as valid or invalid and does not try to distinguish
between the valid, zero, and special states.


# 9761:f2102d45a753 18-Jun-2013 Andreas Sandberg <andreas@sandberg.pp.se>

x86: Make fprem like the fprem on a real x87

The current implementation of fprem simply does an fmod and doesn't
simulate any of the iterative behavior in a real fprem. This isn't
normally a problem, however, it can lead to problems when switching
between CPU models. If switching from a real CPU in the middle of an
fprem loop to a simulated CPU, the output of the fprem loop becomes
correupted. This changeset changes the fprem implementation to work
like the one on real hardware.


# 9758:353587055aff 18-Jun-2013 Andreas Sandberg <andreas@sandberg.pp.se>

x86: Fix the flag handling code in FABS and FCHS

This changeset fixes two problems in the FABS and FCHS
implementation. First, the ISA parser expects the assignment in
flag_code to be a pure assignment and not an and-assignment, which
leads to the isa_parser omitting the misc reg update. Second, the FCHS
and FABS macro-ops don't set the SetStatus flag, which means that the
default micro-op version, which doesn't update FSW, is executed.


# 9699:76828cbe5de4 21-May-2013 Nilay Vaish <nilay@cs.wisc.edu>

x86: add op class for int and fp microops in isa description
Currently all the integer microops are marked as IntAluOp and the floating
point microops are marked as FloatAddOp. This patch adds support for marking
different microops differently. Now IntMultOp, IntDivOp, FloatDivOp,
FloatMultOp, FloatCvtOp, FloatSqrtOp classes will be used as well. This will
help in providing different latencies for different op class.


# 9582:0632d2d1575c 11-Mar-2013 Nilay Vaish <nilay@cs.wisc.edu>

x86: implement some of the x87 instructions
This patch implements ftan, fprem, fyl2x, fld* floating-point instructions.


# 9470:68f7e0bcf4aa 15-Jan-2013 Nilay Vaish <nilay@cs.wisc.edu>

x86: implement fabs, fchs instructions


# 9371:7c1484cc9b10 30-Dec-2012 Nilay Vaish <nilay@cs.wisc.edu>

x86: implement x87 fp instruction fsincos
This patch implements the fsincos instruction. The code was originally written
by Vince Weaver. Gabe had made some comments about the code, but those were
never addressed. This patch addresses those comments.


# 9211:46c3a74952ec 11-Sep-2012 Nilay Vaish <nilay@cs.wisc.edu>

x86: Add a separate register for D flag bit
The D flag bit is part of the cc flag bit register currently. But since it
is not being used any where in the implementation, it creates an unnecessary
dependency. Hence, it is being moved to a separate register.


# 9010:7891b96e1526 22-May-2012 Nilay Vaish <nilay@cs.wisc.edu>

X86: Split Condition Code register
This patch moves the ECF and EZF bits to individual registers (ecfBit and
ezfBit) and the CF and OF bits to cfofFlag registers. This is being done
so as to lower the read after write dependencies on the the condition code
register. Ultimately we will have the following registers [ZAPS], [OF],
[CF], [ECF], [EZF] and [DF]. Note that this is only one part of the
solution for lowering the dependencies. The other part will check whether
or not the condition code register needs to be actually read. This would
be done through a separate patch.


# 8946:fb6c89334b86 14-Apr-2012 Andreas Hansson <andreas.hansson@arm.com>

clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6

This patch addresses a number of minor issues that cause problems when
compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it
avoids using the deprecated ext/hash_map and instead uses
unordered_map (and similarly so for the hash_set). To make use of the
new STL containers, g++ and clang has to be invoked with "-std=c++0x",
and this is now added for all gcc versions >= 4.6, and for clang >=
3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1
unordered_map to avoid the deprecation warning.

The addition of c++0x in turn causes a few problems, as the
compiler is more stringent and adds a number of new warnings. Below,
the most important issues are enumerated:

1) the use of namespaces is more strict, e.g. for isnan, and all
headers opening the entire namespace std are now fixed.

2) another other issue caused by the more stringent compiler is the
narrowing of the embedded python, which used to be a char array,
and is now unsigned char since there were values larger than 128.

3) a particularly odd issue that arose with the new c++0x behaviour is
found in range.hh, where the operator< causes gcc to complain about
the template type parsing (the "<" is interpreted as the beginning
of a template argument), and the problem seems to be related to the
begin/end members introduced for the range-type iteration, which is
a new feature in c++11.

As a minor update, this patch also fixes the build flags for the clang
debug target that used to be shared with gcc and incorrectly use
"-ggdb".


# 8588:ef28ed90449d 27-Sep-2011 Gabe Black <gblack@eecs.umich.edu>

ISA parser: Use '_' instead of '.' to delimit type modifiers on operands.

By using an underscore, the "." is still available and can unambiguously be
used to refer to members of a structure if an operand is a structure, class,
etc. This change mostly just replaces the appropriate "."s with "_"s, but
there were also a few places where the ISA descriptions where handling the
extensions themselves and had their own regular expressions to update. The
regular expressions in the isa parser were updated as well. It also now
looks for one of the defined type extensions specifically after connecting "_"
where before it would look for any sequence of characters after a "."
following an operand name and try to use it as the extension. This helps to
disambiguate cases where a "_" may legitimately be part of an operand name but
not separate the name from the type suffix.

Because leaving the "_" and suffix on the variable name still leaves a valid
C++ identifier and all extensions need to be consistent in a given context, I
considered leaving them on as a breadcrumb that would show what the intended
type was for that operand. Unfortunately the operands can be referred to in
code templates, the Mem operand in particular, and since the exact type of Mem
can be different for different uses of the same template, that broke things.


# 7626:bdd926760470 23-Aug-2010 Gabe Black <gblack@eecs.umich.edu>

X86: Get rid of the flagless microop constructor.

This will reduce clutter in the source and hopefully speed up compilation.


# 7620:3d8a23caa1ef 23-Aug-2010 Gabe Black <gblack@eecs.umich.edu>

X86: Consolidate extra microop flags into one parameter.

This single parameter replaces the collection of bools that set up various
flavors of microops. A flag parameter also allows other flags to be set like
the serialize before/after flags, etc., without having to change the
constructor.


# 7087:fb8d5786ff30 24-May-2010 Nathan Binkert <nate@binkert.org>

copyright: Change HP copyright on x86 code to be more friendly


# 6345:f9ae7c3a036c 16-Jul-2009 Gabe Black <gblack@eecs.umich.edu>

X86: Take limitted advantage of the compilers type checking for microop operands.


# 5788:6d4161a36ca1 07-Jan-2009 Gabe Black <gblack@eecs.umich.edu>

X86: Autogenerate macroop generateDisassemble function.


# 5122:b0527f379eb5 03-Oct-2007 Gabe Black <gblack@eecs.umich.edu>

X86: Fix the movfp microop.


# 5083:49559a8060e8 19-Sep-2007 Gabe Black <gblack@eecs.umich.edu>

X86: Move the fp microops to their own file with their own base classes in C++ and python.