Performance #290

Open
VitalyVaryvdin opened this issue Sep 24, 2021 · 8 comments

Comments

@VitalyVaryvdin

VitalyVaryvdin commented Sep 24, 2021

Rendering a big dataset of candlestick data drops FPS quite a lot.

A 50k-entry dataset brings my FPS down to 30. Doubling the dataset lowers it even more.
Running GLFW + OpenGL 3.3 on Windows 10 with a 2080 Ti. MSAA and software anti-aliasing are disabled.

Candlestick rendering is taken from the implot_demos repository. The update function does nothing other than rendering.
How can I improve my performance? Is there anything like instancing? Or might switching to a different imgui backend help?

@hesa2020

I assume you are trying to render all the candles, even those that are not within the viewport.

My first guess would be to render, in chunks, only the candles that are within the viewable area.

I will experiment with this in a few hours and will post my code if I can get it to render hundreds of thousands of candles :)

@VitalyVaryvdin
Author

Well, I was actually hoping to have good performance when zoomed out on large datasets as well :)

But I'd also like to see a snippet showing how to do proper culling on them.

@hesa2020

hesa2020 commented Sep 26, 2021

OK, I am not sure how to limit the zoom. It seems I can set a starting zoom, but I can't limit it.

What I would do is to:

  1. Limit the X axis to a fixed number of candles shown at once.
  2. If the user wishes to view a bigger interval, they need to switch the candle size, say from a minute to an hour/month/etc.
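
Switching the candle size in step 2 amounts to merging fine candles into coarser ones. A minimal sketch, assuming candles are sorted by time and timestamps are in seconds; the `Candle` struct and `AggregateCandles` function are hypothetical names for illustration and are not part of ImPlot:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical candle record (the thread passes separate arrays instead).
struct Candle {
    double time;  // start of the interval, in seconds
    double open, high, low, close;
};

// Merge consecutive candles into buckets of `bucket_seconds`.
// Assumes `in` is sorted by time in ascending order.
std::vector<Candle> AggregateCandles(const std::vector<Candle>& in, double bucket_seconds) {
    std::vector<Candle> out;
    for (const Candle& c : in) {
        const double bucket_start = std::floor(c.time / bucket_seconds) * bucket_seconds;
        if (out.empty() || out.back().time != bucket_start) {
            // First candle of a new bucket: it provides the open price.
            out.push_back({bucket_start, c.open, c.high, c.low, c.close});
        } else {
            // Later candle in the same bucket: widen the range, move the close.
            Candle& b = out.back();
            b.high = std::max(b.high, c.high);
            b.low = std::min(b.low, c.low);
            b.close = c.close;
        }
    }
    return out;
}
```

Calling `AggregateCandles(minute_candles, 3600.0)` would turn one-minute candles into hourly ones, so a zoomed-out view draws far fewer primitives.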

However, I did boost my FPS by rendering only the candles that are in the viewport. It seems the FitPoint calls also needed to be limited to the viewport, because they were dropping my FPS by a ton.

I was dealing with a dataset of 1 million candles, which wasn't impacting my performance by much.

I am not 100% confident I used the right variables, as I am fairly new to this library, but it looks like it gets the job done. Maybe @epezent can confirm that.

Here is the function I ended up with:

void PlotCandlestick(const char* label_id, const double* xs, const double* opens, const double* closes, const double* lows, const double* highs, int count, bool tooltip, float width_percent, ImVec4 bullCol, ImVec4 bearCol)
{
    // get ImGui window DrawList
    ImDrawList* draw_list = ImPlot::GetPlotDrawList();
    // calc real value width
    double half_width = count > 1 ? (xs[1] - xs[0]) * width_percent : width_percent;

    // custom tool
    if (ImPlot::IsPlotHovered() && tooltip)
    {
        ImPlotPoint mouse = ImPlot::GetPlotMousePos();
        mouse.x = ImPlot::RoundTime(ImPlotTime::FromDouble(mouse.x), ImPlotTimeUnit_Day).ToDouble();
        float  tool_l = ImPlot::PlotToPixels(mouse.x - half_width * 1.5, mouse.y).x;
        float  tool_r = ImPlot::PlotToPixels(mouse.x + half_width * 1.5, mouse.y).x;
        float  tool_t = ImPlot::GetPlotPos().y;
        float  tool_b = tool_t + ImPlot::GetPlotSize().y;
        ImPlot::PushPlotClipRect();
        draw_list->AddRectFilled(ImVec2(tool_l, tool_t), ImVec2(tool_r, tool_b), IM_COL32(128, 128, 128, 64));
        ImPlot::PopPlotClipRect();
        // find mouse location index
        int idx = BinarySearch(xs, 0, count - 1, mouse.x);
        // render tool tip (won't be affected by plot clip rect)
        if (idx != -1)
        {
            ImGui::BeginTooltip();
            char buff[32];
            ImPlot::FormatDate(ImPlotTime::FromDouble(xs[idx]), buff, 32, ImPlotDateFmt_DayMoYr, ImPlot::GetStyle().UseISO8601);
            ImGui::Text("Day:   %s", buff);
            ImGui::Text("Open:  $%.2f", opens[idx]);
            ImGui::Text("Close: $%.2f", closes[idx]);
            ImGui::Text("Low:   $%.2f", lows[idx]);
            ImGui::Text("High:  $%.2f", highs[idx]);
            ImGui::EndTooltip();
        }
    }

    // begin plot item
    if (ImPlot::BeginItem(label_id))
    {
        // override legend icon color
        ImPlot::GetCurrentItem()->Color = IM_COL32(64, 64, 64, 255);

        ImPlotContext& gp = *GImPlot;
        ImPlotPoint plot_start = ImPlot::PixelsToPlot(gp.CurrentPlot->AxesRect.Min.x, 0);
        ImPlotPoint plot_end = ImPlot::PixelsToPlot(gp.CurrentPlot->AxesRect.Max.x, 0);
        // fit data if requested
        if (ImPlot::FitThisFrame())
        {
            for (int i = 0; i < count; ++i)
            {
                if (xs[i] >= plot_start.x && xs[i] <= plot_end.x)
                {
                    ImPlot::FitPoint(ImPlotPoint(xs[i], lows[i]));
                    ImPlot::FitPoint(ImPlotPoint(xs[i], highs[i]));
                }
            }
        }
        // render data
        for (int i = 0; i < count; ++i)
        {
            if (xs[i] >= plot_start.x && xs[i] <= plot_end.x)
            {
                ImVec2 open_pos = ImPlot::PlotToPixels(xs[i] - half_width, opens[i]);
                ImVec2 close_pos = ImPlot::PlotToPixels(xs[i] + half_width, closes[i]);
                ImVec2 low_pos = ImPlot::PlotToPixels(xs[i], lows[i]);
                ImVec2 high_pos = ImPlot::PlotToPixels(xs[i], highs[i]);
                ImU32 color = ImGui::GetColorU32(opens[i] > closes[i] ? bearCol : bullCol);
                draw_list->AddLine(low_pos, high_pos, color);
                draw_list->AddRectFilled(open_pos, close_pos, color);
            }
        }
        // end plot item
        ImPlot::EndItem();
    }
}
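
Both loops in the function above still test every element against the viewport, so the culling itself is linear in `count`. If `xs` is sorted in ascending order (an assumption, though typical for time series), the visible range can be found with a binary search instead. A minimal sketch; `VisibleRange` is a hypothetical helper, not an ImPlot function:

```cpp
#include <algorithm>
#include <utility>

// Returns the half-open index range [first, last) of the elements of xs
// that satisfy x_min <= xs[i] <= x_max.
// Assumes xs is sorted in ascending order and x_min <= x_max.
std::pair<int, int> VisibleRange(const double* xs, int count, double x_min, double x_max) {
    const double* begin = xs;
    const double* end = xs + count;
    // First element that is not less than x_min.
    const double* first = std::lower_bound(begin, end, x_min);
    // First element that is greater than x_max.
    const double* last = std::upper_bound(first, end, x_max);
    return {static_cast<int>(first - begin), static_cast<int>(last - begin)};
}
```

The fit and render loops could then run as `for (int i = range.first; i < range.second; ++i)` with no per-element bounds test. Widening `x_min` and `x_max` by `half_width` would keep candles that are only partially visible at the plot edges.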

NOTE: I rendered 7579 candles at full FPS until an ImGui assert fired with:
Too many vertices in ImDrawList using 16-bit indices. Read comment above

Maybe ImPlot should find a way to split draw lists when more vertices are queued than imgui can render.

@hinxx

hinxx commented Oct 26, 2021

Too many vertices in ImDrawList using 16-bit indices. Read comment above

I think this can be dealt with. See Dear ImGui's imconfig.h:


//---- Use 32-bit vertex indices (default is 16-bit) is one way to allow large meshes with more than 64K vertices.
// Your renderer backend will need to support it (most example renderer backends support both 16/32-bit indices).
// Another way to allow large meshes while keeping 16-bit indices is to handle ImDrawCmd::VtxOffset in your renderer.
// Read about ImGuiBackendFlags_RendererHasVtxOffset for details.
//#define ImDrawIdx unsigned int
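
As a rough back-of-the-envelope check on where the 16-bit limit bites: a 16-bit index can address 65536 vertices per draw list. The sketch below assumes each candle costs about 8 vertices (4 for the wick line, 4 for the body rectangle); that figure is an assumption, since the real cost depends on the anti-aliasing settings. `MaxCandlesFor16BitIndices` is a hypothetical helper for the arithmetic only:

```cpp
// Assumed vertex cost per candle: 4 for the wick line + 4 for the body rect.
// The actual number depends on ImGui's anti-aliasing settings.
constexpr int kAssumedVerticesPerCandle = 8;

// Number of vertices addressable with 16-bit indices.
constexpr int kMaxVertices16Bit = 1 << 16;

// Estimate how many candles fit in one draw list before the 16-bit limit,
// given the vertices already used by the rest of the window.
constexpr int MaxCandlesFor16BitIndices(int vertices_per_candle, int vertices_used_elsewhere) {
    return (kMaxVertices16Bit - vertices_used_elsewhere) / vertices_per_candle;
}
```

Under these assumptions the ceiling is a little over 8000 candles per draw list, and lower once axes, labels, and other widgets take their share. Either of the two options in the imconfig.h comment lifts that ceiling.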

@hinxx

hinxx commented Oct 26, 2021

Sorry for hijacking this thread but I'm looking at getting more performance out of my app, too.

I will have multiple data streams coming in over TCP/IP at an update rate of ~14 Hz max. The individual data traces would have 10k to 500k points. I'm drawing with basic PlotLine at the moment. For the test I'm running an ImGui example with calls to ImPlot; the default FPS is 60. On my machine two traces of 100k points each still retain good FPS at around 60, but one of the CPU cores is at ~100%. I would like to keep FPS high.

Given that my data is likely to be the same over ~4 iterations of the render loop (60/14), I was thinking of caching the computation results from RenderLineStrip() (at a glance this is where the draw list is constructed) until new data arrives from the network. This idea comes from looking at perf output, which points to PlotLine and glibc memmove as the biggest CPU cycle consumers.
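
Dear ImGui rebuilds its draw lists every frame, so caching inside RenderLineStrip() would mean modifying the library. What can be cached on the application side is any per-trace preprocessing (for example a reduced point set) that only needs to change when new data arrives. A minimal sketch of that idea; `VersionedCache` and the version counter are hypothetical and not part of ImGui or ImPlot:

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical cache: recomputes a derived value only when the data version
// changes (e.g. a counter bumped each time a network packet updates a trace).
template <typename T>
class VersionedCache {
public:
    // Returns the cached value, calling `compute` only if `version`
    // differs from the version seen on the previous call.
    const T& Get(std::uint64_t version, const std::function<T()>& compute) {
        if (!valid_ || version != version_) {
            value_ = compute();
            version_ = version;
            valid_ = true;
        }
        return value_;
    }

private:
    T value_{};
    std::uint64_t version_ = 0;
    bool valid_ = false;
};
```

With data arriving at ~14 Hz and rendering at 60 FPS, `compute` would run on roughly one frame in four, and the other frames would pass the cached points straight to the plotting call.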

Maybe using the other 7 cores of the CPU for the PlotLine computation would be an avenue to explore to keep the FPS high.

Also, rendering only the subset of points that are actually on screen is sometimes an option for me, depending on what the user is looking at.

Any comments on the above are welcome!

@gorbatschow

@hinxx
When there are thousands of points to render, it makes sense to downsample before plotting.
For example, using this algorithm: https://github.com/sveinn-steinarsson/flot-downsample
demo here: https://www.base.is/flot/
and a C++ port: https://gist.github.com/gorbatschow/ce36c15d9265b61d12a1be1783bf0abf
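
The linked algorithm is Largest-Triangle-Three-Buckets (LTTB). For readers who do not want to follow the links, here is a minimal, independent sketch of it. This is not the code from the gist above; the `Point` struct and the `DownsampleLTTB` name are assumptions made for illustration:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };

// Reduce `data` to `threshold` points with Largest-Triangle-Three-Buckets.
// The first and last points are always kept. Assumes data is ordered by x.
std::vector<Point> DownsampleLTTB(const std::vector<Point>& data, std::size_t threshold) {
    const std::size_t n = data.size();
    if (threshold >= n || threshold < 3) return data;  // nothing to reduce

    std::vector<Point> sampled;
    sampled.reserve(threshold);
    // Bucket width for the interior points (first and last are excluded).
    const double every = static_cast<double>(n - 2) / static_cast<double>(threshold - 2);
    std::size_t a = 0;  // index of the most recently selected point
    sampled.push_back(data[a]);

    for (std::size_t i = 0; i + 2 < threshold; ++i) {
        // Average point of the next bucket, used as the third triangle corner.
        std::size_t avg_start = static_cast<std::size_t>(std::floor((i + 1) * every)) + 1;
        std::size_t avg_end = static_cast<std::size_t>(std::floor((i + 2) * every)) + 1;
        if (avg_end > n) avg_end = n;
        double avg_x = 0.0, avg_y = 0.0;
        for (std::size_t j = avg_start; j < avg_end; ++j) {
            avg_x += data[j].x;
            avg_y += data[j].y;
        }
        const double avg_len = static_cast<double>(avg_end - avg_start);
        avg_x /= avg_len;
        avg_y /= avg_len;

        // In the current bucket, pick the point that forms the largest
        // triangle with the previous selection and the next bucket's average.
        const std::size_t range_start = static_cast<std::size_t>(std::floor(i * every)) + 1;
        const std::size_t range_end = static_cast<std::size_t>(std::floor((i + 1) * every)) + 1;
        double max_area = -1.0;
        std::size_t next_a = range_start;
        for (std::size_t j = range_start; j < range_end; ++j) {
            const double area = std::fabs((data[a].x - avg_x) * (data[j].y - data[a].y) -
                                          (data[a].x - data[j].x) * (avg_y - data[a].y)) * 0.5;
            if (area > max_area) {
                max_area = area;
                next_a = j;
            }
        }
        sampled.push_back(data[next_a]);
        a = next_a;
    }
    sampled.push_back(data[n - 1]);
    return sampled;
}
```

The `threshold` parameter is the number of points in the output, so `DownsampleLTTB(trace, 1000)` would hand 1000 points to the plotting call regardless of the input size. The downsampling pass itself is linear in the input size, which is why it pairs well with recomputing only when new data arrives.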

@hinxx

hinxx commented Nov 11, 2021

That looks like a very good approach for me, @gorbatschow! Will test and report ASAP.

@hinxx

hinxx commented Nov 11, 2021

I can shave off half of the CPU cycles for a trace of 100 000 points when downsampling with a small threshold (e.g. 1000). Interestingly, using a threshold of 100 or 10 000 results in a negligible change in CPU usage compared to 1000.
